Re: Where Can I Find The Code Segment For WAL Control? - Mailing list pgsql-hackers
From | Csaba Nagy |
---|---|
Subject | Re: Where Can I Find The Code Segment For WAL Control? |
Date | |
Msg-id | 1142245327.11827.82.camel@coppola.muc.ecircle.de |
In response to | Re: Where Can I Find The Code Segment For WAL Control? (Csaba Nagy <nagy@ecircle-ag.com>) |
List | pgsql-hackers |
[Please use "reply to all" so the list is CC-d]

Charlie,

I guess what you're after is to make sure the WAL buffers are shipped to the stand-by at the same time as they are committed to disk. Otherwise your wish to have the stand-by EXACTLY in sync with the primary server is not going to work. But that would mean that the communication to the stand-by becomes a performance bottleneck, as all transactions are only finished after the WAL records for them are synced to the disk. So if you want your stand-by completely in sync with your primary, you will want the transactions to finish only after their WAL records are pushed to the stand-by too... and then if the communication to the stand-by fails, all your transactions will wait on it, possibly causing the primary to stop working properly. So now you have another point of failure, and instead of making the setup safer, you make it less safe.

What I want to say is that it is likely not feasible to keep the stand-by completely in sync. In practice it is enough to keep the stand-by NEARLY in sync with the primary server. That means you ship the WAL records asynchronously, i.e. after they are written to the disk, and in a separate thread.

What I'm after is to have a thread which starts streaming the current WAL file, and keeps streaming it as it grows. I'm not completely sure how I'll implement that, but I guess it will need to loop, transfer whatever records are available, and then sleep a few seconds when it reaches the end. It must be prepared to stumble upon partially written WAL records, and sleep on those too. On the stand-by end, the current partial WAL will not be used unless the stand-by is fired up...

So I'm after a solution which makes sure the stand-by is as up to date as possible, with a gap of a few seconds allowed in normal operation, and possibly more if the communication channel has bandwidth problems and the server is very busy.
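The streaming loop described above could be sketched roughly as follows. This is only an illustration in Python, not PostgreSQL code: it assumes a plain append-only file and a made-up record framing (a 4-byte big-endian length followed by the payload), which is NOT the real WAL record format. The names `split_complete_records`, `stream_growing_file`, `send` and `max_idle_polls` are hypothetical.

```python
import struct
import time
from typing import Callable, List, Tuple

# Hypothetical framing for illustration only: 4-byte big-endian payload length.
HEADER = struct.Struct(">I")


def split_complete_records(buf: bytes) -> Tuple[List[bytes], bytes]:
    """Return (complete records, leftover bytes of a partially written record)."""
    records = []
    pos = 0
    while len(buf) - pos >= HEADER.size:
        (length,) = HEADER.unpack_from(buf, pos)
        end = pos + HEADER.size + length
        if end > len(buf):
            break  # payload not fully written yet; keep it for the next round
        records.append(buf[pos + HEADER.size:end])
        pos = end
    return records, buf[pos:]


def stream_growing_file(path: str, send: Callable[[bytes], None],
                        poll_interval: float = 2.0,
                        max_idle_polls: int = 0) -> None:
    """Ship complete records from a growing file; sleep when nothing new arrives.

    max_idle_polls=0 means run forever; a positive value makes the loop return
    after that many consecutive empty reads (useful for trying it out).
    """
    pending = b""  # bytes of a record that is not complete yet
    idle = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(65536)
            if chunk:
                idle = 0
                records, pending = split_complete_records(pending + chunk)
                for rec in records:
                    send(rec)  # e.g. write to a socket connected to the stand-by
                continue
            idle += 1
            if max_idle_polls and idle >= max_idle_polls:
                return
            time.sleep(poll_interval)  # reached the end: wait for more data
```

A partially written record is never passed to `send`; its bytes stay in `pending` until the rest shows up, which corresponds to "sleeping on" partial records.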
Usually if the server crashes, then there are worse problems than a few seconds/minutes worth of lost transactions. To name one, if the server crashes you will for sure have at least a few minutes of downtime. At least for our application, downtime in a busy period is actually worse than the lost data (which we can recover from other logs)...

Cheers,
Csaba.

On Sun, 2006-03-12 at 02:50, 王宝兵 wrote:
> Csaba:
>
> Firstly I must thank you for your help. Some of our designs are identical
> except the following:
>
> > - create a standby manager program which only needs to know how to
> > access the primary server in order to create the standby (by connecting
> > to it through normal data base connections and using the above mentioned
> > functions to stream the files);
>
> In my opinion, if we create a standby manager program and run it as a daemon
> process, it will check the state of the WAL files of the Principal every few
> seconds. But there is a risk of data loss. For instance, suppose the Principal
> has flushed its log buffer to the disk and the dirty data are also flushed
> immediately, but the standby manager program is in its sleep interval. Then
> the Principal fails. In this situation, the Principal has updated its database
> but the log segment hasn't been sent to the Mirror, because the time for
> the standby manager program to check the WAL files hasn't come. And then
> these data are lost.
>
> I think this situation is very likely to happen in a big corporation and
> it's very serious.
>
> Perhaps I have misunderstood your opinion. If so, I apologize.
>
> Charlie Wang