Re: [HACKERS] Point in Time Recovery - Mailing list pgsql-patches

From Bruce Momjian
Subject Re: [HACKERS] Point in Time Recovery
Msg-id 200407191635.i6JGZ5A21033@candle.pha.pa.us
In response to Re: [HACKERS] Point in Time Recovery  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] Point in Time Recovery
List pgsql-patches
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Tom Lane wrote:
> >> * Documentation is, um, lacking.  (One point in particular is that I
> >> inserted the recovery.conf.sample file into CVS, but did not fill in
> >> the patch's lack of attempt to install it anywhere.)
>
> > I figure it should go in share like the other sample files, and tell
> > people to copy it to /data and modify it for recovery.
>
> It should certainly go to /share as a .sample file.  I was thinking that
> initdb should perhaps copy it into $PGDATA (still as .sample, not as
> .conf!) so it'd be right there when you need it.

I think /share is best.  I see other *.sample files there that aren't
used until you rename them and move them to the right directory, and
recovery.conf.sample seems the same.  I think having the sample at the
top of the data directory, when most people will never use it, is strange.

> >> Perhaps the last point is really a backup-process issue.  AFAICS there
> >> is no good reason for a backup tarfile to include $PGDATA/pg_xlog at
> >> all, and some good reasons for it not to.
>
> > Seems we should just clear out the /pg_xlog directory before we start
> > recovery.
>
> No, that's a horrid idea, because it loses the ability to combine
> archival xlog files with recent files in /pg_xlog that are not yet
> archived.  We need to distinguish old files that were accidentally
> captured by backup from very-recent files.  I think the cleanest way to
> do that is for backup not to capture them in the first place.

I am confused.  Aren't we always doing a restore from a backup?  Are you
saying there are cases where we aren't and need the stuff in pg_xlog?
Are you saying we might have some new WAL files that we want to add to
pg_xlog before we do the restore, like the most recent WAL that wasn't
archived because it wasn't finished?  Why would we be doing a recovery if
we had such files?  I see your point that we wouldn't know which file
to use, the archive version or the pg_xlog version, but actually
wouldn't the archive version always be preferred because we would know
it to be complete?
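
A minimal sketch of that preference, assuming the names of the archived
segments and of the segments sitting in pg_xlog are both known up front.
The function name and the plain-set inputs are hypothetical, for
illustration only; this is not how the committed patch decides.

```python
def pick_segment_source(segment, archived, in_pg_xlog):
    """Say where a WAL segment should be read from during recovery.

    Hypothetical helper: `archived` and `in_pg_xlog` are sets of segment
    file names. The archived copy wins whenever it exists, on the
    assumption that only complete segments are ever archived.
    """
    if segment in archived:
        return "archive"
    if segment in in_pg_xlog:
        # Only a local copy exists, e.g. a recent segment not yet archived.
        return "pg_xlog"
    return None  # segment is not available anywhere


archived = {"000000010000000000000001", "000000010000000000000002"}
local = {"000000010000000000000002", "000000010000000000000003"}

for name in sorted(archived | local):
    print(name, pick_segment_source(name, archived, local))
```

Under this rule the third segment is still read from pg_xlog, which is
the case where the local files cannot simply be cleared out.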

I don't see any reliable way to prevent people from having pg_xlog in
their backups, since they might use snapshots, tar, etc.

> > We are going to rename recovery.conf to recovery.in-progress
> > or something to prevent us from clearing out the directory after a
> > crash, right?
>
> I had second thoughts about that and didn't do it in the committed
> patch, though it's certainly still open for debate.

How are we handling a crash during recovery?

> > (I see you rename recovery.conf to recovery.done.  Is
> > that wise?
>
> Yes.  Once you've done with a PITR recovery you definitely do *not* want
> a subsequent crash recovery to think it should obey your recovery_target
> limit.  But if you fail before you've finished the recovery run it
> should theoretically be okay to retry, so I didn't add code to rename to
> "recovery.inprogress".  We can certainly add it later if we decide it's
> a good idea.

Ah, OK, so it just keeps going.  However, we don't know if what is in
pg_xlog was in the process of being copied from the archive at the time
of the crash, no?  In fact I am wondering if we should be transferring
the archive files into temporary names, then doing an 'mv' to make them
current so we don't get partial files in pg_xlog.  However, we can't do
that because we are using a user-supplied command line.  Should we pass
a fake name to the command string, then do the 'mv' ourselves?  With WAL
now, we do an fsync so we know the contents are crash-proof, but I am
not sure how to do that during recovery.  I guess this gets back to how
to handle the contents of pg_xlog during recovery.
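
The temporary-name idea could look roughly like the sketch below.  It
assumes a POSIX filesystem, where a rename within one directory is
atomic, and it stands in a hypothetical fetch(segment, dest_path)
callable for the user-supplied command.  The ".tmp" suffix and the
function names are illustrative and not taken from the patch.

```python
import os


def _fsync_path(path, flags):
    """Open `path`, flush it to stable storage, and close it again."""
    fd = os.open(path, flags)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def restore_atomically(fetch, xlog_dir, segment):
    """Fetch one WAL segment so that pg_xlog never shows a partial file.

    `fetch(segment, dest_path)` is a hypothetical stand-in for the
    user-supplied restore command; it is handed a temporary ("fake")
    name rather than the real one.
    """
    final_path = os.path.join(xlog_dir, segment)
    temp_path = final_path + ".tmp"  # assumed suffix, illustration only
    try:
        fetch(segment, temp_path)
        # Make the contents durable before the file gets its real name.
        _fsync_path(temp_path, os.O_RDWR)
        # Atomic on POSIX: readers see the whole file or no file at all.
        os.replace(temp_path, final_path)
    except Exception:
        # A failed or interrupted fetch leaves nothing under either name.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
    # Flush the directory entry so the rename itself survives a crash.
    _fsync_path(xlog_dir, os.O_RDONLY)
    return final_path
```

If a crash lands in the middle of the fetch, the only leftover is a
".tmp" file, which a restarted recovery could delete and fetch again.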

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
