Re: Point in Time Recovery - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Point in Time Recovery
Date
Msg-id 200407281619.i6SGJoG27790@candle.pha.pa.us
Whole thread Raw
In response to Re: Point in Time Recovery  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Oh, here is something else we need to add --- a GUC to control whether
pg_xlog is clean on recovery start.

---------------------------------------------------------------------------

Tom Lane wrote:
> Bruce and I had another phone chat about the problems that can ensue
> if you restore a tar backup that contains old (incompletely filled)
> versions of WAL segment files.  While the current code will ignore them
> during the recovery-from-archive run, leaving them laying around seems
> awfully dangerous.  One nasty possibility is that the archiving
> mechanism will pick up these files and overwrite good copies in the
> archive area with the obsolete ones from the backup :-(.
> 
> Bruce earlier proposed that we simply "rm pg_xlog/*" at the start of
> a recovery-from-archive run, but as I said I'm scared to death of code
> that does such a thing automatically.  In particular this would make it
> impossible to handle scenarios where you want to do a PITR recovery but
> you need to use some recent WAL segments that didn't make it into your
> archive yet.  (Maybe you could get around this by forcibly transferring
> such segments into the archive, but that seems like a bad idea for
> incomplete segments.)
> 
> It would really be best for the DBA to make sure that the starting
> condition for the recovery run does not have any obsolete segment files
> in pg_xlog.  He could do this either by setting up his backup policy so
> that pg_xlog isn't included in the tar backup in the first place, or by
> manually removing the included files just after restoring the backup,
> before he tries to start the recovery run.
> 
> Of course the objection to that is "what if the DBA forgets to do it?"
> 
> The idea that we came to on the phone was for the postmaster, when it
> enters recovery mode because a recovery.conf file exists, to look in
> pg_xlog for existing segment files and refuse to start if any are there
> --- *unless* the user has put a special, non-default overriding flag
> into recovery.conf.  Call it "use_unarchived_files" or something like
> that.  We'd have to provide good documentation and an extensive HINT of
> course, but basically the DBA would have two choices when he gets this
> refusal to start:
> 
> 1. Remove all the segment files in pg_xlog.  (This would be the right
> thing to do if he knows they all came off the backup.)
> 
> 2. Verify that pg_xlog contains only segment files that are newer than
> what's stored in the WAL archive, and then set the override flag in
> recovery.conf.  In this case the DBA is taking responsibility for
> leaving only segment files that are good to use.
> 
> One interesting point is that with such a policy, we could use locally
> available WAL segments in preference to pulling the same segments from
> archive, which would be at least marginally more efficient, and seems
> logically cleaner anyway.
> 
> In particular it seems that this would be a useful arrangement in cases
> where you have questionable WAL segments --- you're not sure if they're
> good or not.  Rather than having to push questionable data into your WAL
> archive, you can leave it local, try a recovery run, and see if you like
> the resulting state.  If not, it's a lot easier to do-over when you have
> not corrupted your archive area.
> 
> Comments?  Better ideas?
> 
>             regards, tom lane
> 

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Point in Time Recovery
Next
From: Bruce Momjian
Date:
Subject: Re: [ADMIN] Point in Time Recovery