Re: Change pg_last_xlog_receive_location not to move backwards - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: Change pg_last_xlog_receive_location not to move backwards |
Date | |
Msg-id | AANLkTinrwPqu8q1oJ9muc+Z0o4qBJhc1RQXD=xZjDdxF@mail.gmail.com Whole thread Raw |
In response to | Re: Change pg_last_xlog_receive_location not to move backwards (Jeff Janes <jeff.janes@gmail.com>) |
Responses |
Re: Change pg_last_xlog_receive_location not to move
backwards
Re: Change pg_last_xlog_receive_location not to move backwards |
List | pgsql-hackers |
On Sun, Jan 30, 2011 at 10:44 AM, Jeff Janes <jeff.janes@gmail.com> wrote: > I do not understand what doing so gets us. > > Say we previously received 2/3 of a WAL file, and replayed most of it. > So now the shared buffers have data that has been synced to that WAL > file already, and some of those dirty shared buffers have been written > to disk and some have not. At this point, we need the data in the first > 2/3 of the WAL file in order to reach a consistent state. But now we > lose the connection to the master, and then we restore it. Now we > request the entire file from the start rather than from where it > left off. > > Either of two things happens. Most likely, the newly received WAL file > matches the file it is overwriting, in which case there was no > point in asking for it. > > Less likely, the master is feeding us gibberish. By requesting the > full WAL file, we check the header and detect that the master is feeding > us gibberish. Unfortunately, we only detect that fact *after* we have > replaced a critical part of our own (previously good) copy of the WAL > file with said gibberish. The standby is now in an unrecoverable state. Right. To avoid this problem completely, IMO, walreceiver should validate the received WAL data before writing it. Or, walreceiver should write the WAL to the transient file, and the startup process should rename it to the correct name after replaying it. We should do something like the above? > With a bit of malicious engineering, I have created this situation. > I don't know how likely it is that something like that could happen > accidentally, say with a corrupted file system. I have been unable > to engineer a situation where checking the header actually does > any good. It has either done nothing, or done harm. OK, I seem to have to consider again why the code which retreats the replication starting location exists. At first, I added the code to prevent a half-baked WAL file. For example, please imagine the case where you start the standby server with no WAL files in pg_xlog. In this case, if replication starts from the middle of WAL file, the received WAL file is obviously broken (i.e., with no WAL data in the first half of file). This broken WAL file might cause trouble when we restart the standby and even when we just promote it (because the last applied WAL file is re-fetched at the end of recovery). OTOH, we can start replication from the middle of WAL file if we can *ensure* that the first half of WAL file already exists. At least, when the standby reconnects to the master, we might be able to ensure that and start from the middle. > OK, thanks for the explanation. Is there a race condition here? That is, > it seems that with your patch this part of the code could get executed > after streaming is restarted, but before the streaming ever actually received > and wrote anything. So it would be checking the old header, not the newly > about-to-be received header. As far as I read the code again, there is no such a race condition. When the received WAL is read just after streaming is restarted, that part of the code seems to always be executed. Maybe I'm missing something. Could you explain about the problematic scenario? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
pgsql-hackers by date: