Re: Problem with PITR recovery - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Problem with PITR recovery
Date
Msg-id 200504181322.j3IDMEX11712@candle.pha.pa.us
In response to Re: Problem with PITR recovery  (Jeff Davis <jdavis-pgsql@empires.org>)
Responses Re: Problem with PITR recovery
List pgsql-hackers
Jeff Davis wrote:
> On Mon, 2005-04-18 at 00:20 -0400, Bruce Momjian wrote:
> > Jeff Davis wrote:
> > > 
> > > Can you sort of run through the failure case again, and how to prevent
> > > it?
> > 
> > The failure case in the original docs is that you do your
> > pg_stop_backup(), and then delete all the WAL files before the *.backup
> > file that was just created.  However, you do not have a valid tar backup
> > until you have archived all the WAL files used from the *.backup WAL
> > file up to the WAL file that was active at pg_stop_backup(), which is
> > mentioned in the *.backup file.  If you went and deleted your old WAL
> > files anyway, without waiting for those other WAL files to be archived,
> > and your disk drive crashed, you wouldn't have a tar backup you could
> > use, and you had deleted the old WAL files you would have needed to
> > recover your previous tar backup.
> > 
> > Is there something in the current wording that needs clarification?
> > 
> 
> So, as I understand it: everything works great as long as everything has
> been archived up to and including the WAL file that was active when you
> did pg_stop_backup(). However, if you do pg_stop_backup() and
> immediately delete PGDATA (before any WAL files are archived), the
> backup may fail.

Right, and that is the issue that wasn't documented before, and I was
even unclear about it myself when testing initially.

> I think, to clear it up a little, you might add a step 5 before saying
> "If this returns successfully, you're done.", so that people know for

I see your point. New text is:

    4 Again connect to the database as a superuser, and issue the command
          SELECT pg_stop_backup();
      This should return successfully.

    5 Once the WAL segment files used during the backup are archived as
      part of normal database activity, you are done.
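
Step 5 could be checked mechanically by reading the backup history file and looking for the segments it names in the archive. The following is only a minimal sketch: it assumes the *.backup file contains lines of the form "START WAL LOCATION: ... (file <segment>)" and "STOP WAL LOCATION: ... (file <segment>)", and that the archiver copies segments in order, so that finding the stop segment implies the earlier ones arrived too. The function names are hypothetical.

```python
import re

# Assumed line format inside the *.backup history file, for example:
#   START WAL LOCATION: 0/9000020 (file 000000010000000000000009)
_LOCATION = re.compile(
    r"^(START|STOP) WAL LOCATION: \S+ \(file ([0-9A-F]{24})\)", re.M
)


def backup_segments(history_text):
    """Return (start_segment, stop_segment) named in a backup history file."""
    found = {kind: segment for kind, segment in _LOCATION.findall(history_text)}
    if "START" not in found or "STOP" not in found:
        raise ValueError("backup history file lacks START/STOP WAL LOCATION")
    return found["START"], found["STOP"]


def base_backup_is_complete(history_text, archived_names):
    """True once both the first and last segment of the backup are archived.

    Assumption: segments are archived in order, so the presence of the
    stop segment implies every segment between start and stop is present.
    """
    start, stop = backup_segments(history_text)
    archived = set(archived_names)
    return start in archived and stop in archived
```

Until this returns True, the previous base backup and its WAL files are still the only usable recovery path and should not be deleted.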
 

> sure that they get a good base backup. It actually seems like something
> that maybe pg_stop_backup() should do in the future.

Yes, I added that to the TODO list:

    * Force archiving of partially-full WAL files when pg_stop_backup() is
      called or the server is stopped

      Doing this will allow administrators to know more easily when the
      archive contains all the files needed for point-in-time recovery.
 

> It's a little unclear how you tell which WAL segment was active during
> pg_stop_backup(), but that shouldn't be a practical concern since you
> can just manually archive them all.

We do have this sentence:

    Once you have safely archived the WAL segment files used during the
    filesystem backup (as specified in the backup history file), you can
    delete all archived WAL segments with names numerically less.
 

The information is actually in the *.backup file.  I think that is the
only way to know.
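
The "names numerically less" rule can be sketched as follows. This is an illustration under stated assumptions, not a tested cleanup tool: it assumes the backup history file is named after the first WAL segment the backup needs (for example 000000010000000000000009.00000020.backup), that segment names are 24 uppercase hex digits, and that only one timeline is involved. The function name is hypothetical.

```python
import re

# WAL segment file names are assumed to be exactly 24 uppercase hex digits.
_SEGMENT = re.compile(r"[0-9A-F]{24}")


def deletable_segments(history_file_name, archived_names):
    """List archived segments numerically below the backup's first segment.

    Only meaningful once the segments used during the backup itself have
    been archived; until then the older segments are still needed.
    """
    start = history_file_name.split(".", 1)[0]
    if not _SEGMENT.fullmatch(start):
        raise ValueError("unexpected backup history file name")
    limit = int(start, 16)
    return sorted(
        name
        for name in archived_names
        if _SEGMENT.fullmatch(name) and int(name, 16) < limit
    )
```

Anything that is not a plain segment name, such as the history files themselves, is left alone by this sketch.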

And you can't manually copy the WAL files to the archive because they
aren't full and the recommended archive_command will fail if those files
are already in the archive.  You could copy them off somewhere else, I
suppose.

> Maybe step 5 could be something like:
> (5) Make a copy of all WAL segments above XXXX.backup and store with the
> base backup. When it's time to recover, if those WAL segments were not
> properly archived, you need to have them available.

Again, that doesn't work because of the "no overwrite" behavior of the
archive_command.
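
To make the conflict concrete, here is a minimal sketch of a copy step that refuses to overwrite, mimicking the behavior described above. It is not the actual archive_command, which is a shell command configured by the administrator; the function name and return convention (0 for success, nonzero for failure, like a shell exit status) are assumptions for illustration.

```python
import os
import shutil


def archive_segment(src_path, archive_dir):
    """Copy a WAL segment into archive_dir unless the name already exists.

    Returns 0 on success and 1 if the destination is already present,
    in the style of a shell exit status.
    """
    dest = os.path.join(archive_dir, os.path.basename(src_path))
    if os.path.exists(dest):
        # Refuse to overwrite: a partial copy placed here by hand would
        # block the later archiving of the completed segment.
        return 1
    shutil.copyfile(src_path, dest)
    return 0
```

If a partially filled segment is copied into the archive by hand, the later attempt to archive the full segment under the same name fails, and the archive keeps the partial copy. Copying the partial files to a separate directory avoids that collision.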

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 

