Re: Change pg_last_xlog_receive_location not to move backwards - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Re: Change pg_last_xlog_receive_location not to move backwards |
Date | |
Msg-id | 1297531924.1747.3209.camel@ebony Whole thread Raw |
In response to | Re: Change pg_last_xlog_receive_location not to move backwards (Fujii Masao <masao.fujii@gmail.com>) |
List | pgsql-hackers |
On Mon, 2011-01-31 at 16:12 +0900, Fujii Masao wrote: > On Sun, Jan 30, 2011 at 10:44 AM, Jeff Janes <jeff.janes@gmail.com> wrote: > > I do not understand what doing so gets us. > > > > Say we previously received 2/3 of a WAL file, and replayed most of it. > > So now the shared buffers have data that has been synced to that WAL > > file already, and some of those dirty shared buffers have been written > > to disk and some have not. At this point, we need the data in the first > > 2/3 of the WAL file in order to reach a consistent state. But now we > > lose the connection to the master, and then we restore it. Now we > > request the entire file from the start rather than from where it > > left off. > > > > Either of two things happens. Most likely, the newly received WAL file > > matches the file it is overwriting, in which case there was no > > point in asking for it. > > > > Less likely, the master is feeding us gibberish. By requesting the > > full WAL file, we check the header and detect that the master is feeding > > us gibberish. Unfortunately, we only detect that fact *after* we have > > replaced a critical part of our own (previously good) copy of the WAL > > file with said gibberish. The standby is now in an unrecoverable state. > > Right. To avoid this problem completely, IMO, walreceiver should validate > the received WAL data before writing it. Or, walreceiver should write the > WAL to the transient file, and the startup process should rename it to the > correct name after replaying it. > > We should do something like the above? > > > With a bit of malicious engineering, I have created this situation. > > I don't know how likely it is that something like that could happen > > accidentally, say with a corrupted file system. I have been unable > > to engineer a situation where checking the header actually does > > any good. It has either done nothing, or done harm. > > OK, I seem to have to consider again why the code which retreats the > replication starting location exists. > > At first, I added the code to prevent a half-baked WAL file. For example, > please imagine the case where you start the standby server with no WAL > files in pg_xlog. In this case, if replication starts from the middle of WAL > file, the received WAL file is obviously broken (i.e., with no WAL data in > the first half of file). This broken WAL file might cause trouble when we > restart the standby and even when we just promote it (because the last > applied WAL file is re-fetched at the end of recovery). > > OTOH, we can start replication from the middle of WAL file if we can > *ensure* that the first half of WAL file already exists. At least, when the > standby reconnects to the master, we might be able to ensure that and > start from the middle. Some important points here, but it seems complex. AFAICS we need to do two things * check header of WAL file matches when we start streaming * start streaming from last received location So why not do them separately, rather than rewinding the received location and kludging things? Seems easier to send all required info in a separate data packet, then validate the existing WAL file against that. Then start streaming from last received location. That way we don't have any "going backwards" logic at all. -- Simon Riggs http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services
pgsql-hackers by date: