Re: BUG: *FF WALs under 9.2 (WAS: .ready files appearing on slaves) - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: BUG: *FF WALs under 9.2 (WAS: .ready files appearing on slaves) |
Date | |
Msg-id | 544F9E9A.9020808@vmware.com |
In response to | Re: BUG: *FF WALs under 9.2 (WAS: .ready files appearing on slaves) (Heikki Linnakangas <hlinnakangas@vmware.com>) |
Responses | Re: BUG: *FF WALs under 9.2 (WAS: .ready files appearing on slaves) |
List | pgsql-hackers |
On 10/27/2014 06:12 PM, Heikki Linnakangas wrote:
> On 10/27/2014 02:12 PM, Fujii Masao wrote:
>> On Fri, Oct 24, 2014 at 10:05 PM, Heikki Linnakangas
>> <hlinnakangas@vmware.com> wrote:
>>> On 10/23/2014 11:09 AM, Heikki Linnakangas wrote:
>>>> At least for master, we should consider changing the way the archiving
>>>> works so that we only archive WAL that was generated in the same server.
>>>> I.e. we should never try to archive WAL files belonging to another
>>>> timeline.
>>>>
>>>> I just remembered that we discussed a different problem related to this
>>>> some time ago, at
>>>>
>>>> http://www.postgresql.org/message-id/20131212.110002.204892575.horiguchi.kyotaro@lab.ntt.co.jp.
>>>> The conclusion of that was that at promotion, we should not archive the
>>>> last, partial, segment from the old timeline.
>>>
>>> So, this is what I came up with for master. Does anyone see a problem with
>>> it?
>>
>> What about the problem that I raised upthread? This is, the patch
>> prevents the last, partial, WAL file of the old timeline from being archived.
>> So we can never PITR the database to the point that the last, partial WAL
>> file has.
>
> A partial WAL file is never archived in the master server to begin with,
> so if it's ever used in archive recovery, the administrator must have
> performed some manual action to copy the partial WAL file from the
> original server. When he does that, he can also copy it manually to the
> archive, or whatever he wants to do with it.
>
> Note that the same applies to any complete, but not-yet archived WAL
> files. But we've never had any mechanism in place to archive those in
> the new instance, after PITR.

Actually, I'll take back what I said above. I had misunderstood the
current behavior. Currently, a server *does* archive any files that you
copy manually to pg_xlog, after PITR has finished. Eventually.
We don't create a .ready file for them until they're old enough to be
recycled. We do create a .ready file for the last, partial, segment, but
it's pretty weird to do it just for that, and not for any other, complete,
segments that might've been copied to pg_xlog. So what happens is that
the last partial segment gets archived immediately after promotion, but
any older segments will linger unarchived until much later.

The special treatment of the last partial segment still makes no sense.
If we want the segments from the old timeline to be archived after PITR,
we should archive them all immediately after end of recovery, not just
the partial one. The exception for just the last partial segment is silly.

Now, the bigger question is whether we want the server after PITR to be
responsible for archiving the segments from the old timeline at all. If
we do, then we should remove the special treatment of the last, partial
segment, and create the .ready files for all the complete segments too.
And actually, I think we should *not* archive the partial segment. We
don't normally archive partial segments, and all the WAL required to
restore the server to the new timeline is copied to the file with the new
TLI. If the old timeline is still live, i.e. there's a server somewhere
still writing new WAL on the old timeline, the partial segment will clash
with a complete segment that the other server will archive later.

Yet another consideration is that we currently don't archive files
streamed from the master. If we think that the standby server is
responsible for archiving old segments after recovery, why is it not
responsible for archiving the streamed segments? It's because in most
cases, the master will archive the file, and we don't want two servers
to archive the same file, but there is actually no guarantee of that. It
might well be that the archiver runs a little bit behind in the master,
and after a crash the archive will be missing some of the segments
required. That's not good either.
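
To see which old-timeline segments are in that "lingering" state on a
given server, something like the following could be pointed at a data
directory. This is only a rough sketch and not anything taken from the
server code: the function name is made up, the current timeline ID has to
be supplied by hand, and it simply assumes the usual pg_xlog layout
(segment file names of 24 hex digits whose first 8 digits are the
timeline ID, with .ready/.done markers under archive_status). It makes no
attempt to tell a partial segment from a complete one.

```python
import os
import re

# WAL segment file names are 24 hex digits: the first 8 are the timeline ID,
# the remaining 16 identify the segment within that timeline.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")


def unarchived_old_timeline_segments(xlog_dir, current_tli):
    """Hypothetical helper: list segments from timelines older than
    current_tli that have neither a .ready nor a .done marker."""
    status_dir = os.path.join(xlog_dir, "archive_status")
    lingering = []
    for name in sorted(os.listdir(xlog_dir)):
        if not WAL_NAME.match(name):
            continue  # skip history files, archive_status, and so on
        tli = int(name[:8], 16)
        if tli >= current_tli:
            continue  # only segments from older timelines are of interest
        has_marker = any(
            os.path.exists(os.path.join(status_dir, name + suffix))
            for suffix in (".ready", ".done")
        )
        if not has_marker:
            lingering.append(name)
    return lingering
```

Called as unarchived_old_timeline_segments("/path/to/pg_xlog", 2) on a
server that was promoted to timeline 2, it would return the timeline-1
files that the archiver has not yet been told about.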
I'm not sure what to do here. The current behavior is inconsistent, and
there are some nasty gotchas that would be nice to fix. I think someone
needs to sit down and write a high-level design of how this all should
work.

- Heikki