Re: [HACKERS] Point in Time Recovery - Mailing list pgsql-patches

From Simon Riggs
Subject Re: [HACKERS] Point in Time Recovery
Msg-id 1090222505.17493.22360.camel@stromboli
In response to Re: [HACKERS] Point in Time Recovery  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-patches
On Mon, 2004-07-19 at 04:03, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > Latest version, pitr_v5_2.patch...
>
> Reviewed and committed with some adjustments.
>

Wow! Thanks very much - you work fast.

I'll be re-testing later today.

> I see the following significant loose ends:
>
> * Documentation is, um, lacking.  (One point in particular is that I
> inserted the recovery.conf.sample file into CVS, but did not fill in
> the patch's lack of attempt to install it anywhere.)
>

Yes...wasn't sure what to do with that. Is everybody happy to install it
as a sample into the main Data Directory? (i.e. as recovery.conf.sample
rather than recovery.conf which would be a bad thing).

> * As Bruce has pointed out already, the process of making a backup
> needs some improvements for more safety: the starting and ending WAL
> offsets have got to be recorded somehow.
>

Haven't got to that yet, but will do.

> * As I have pointed out already, we need to invent "timelines" to
> allow incompatible WAL segments to exist side-by-side.  I will volunteer
> to look into this.

Yes, discussing on the other thread.

>
> * I think creating a .ready file during XLogFileOpen is completely bogus,
> for reasons mentioned in committed comments (look for XXX).  Possibly
> this can go away with timelines.

Yes, to some extent it would go away with timelines.

If you have a local copy at the end of a timeline that isn't archived,
then it seems a good idea to archive it, or at least copy it somewhere
safe. If you don't then you will not be able to revert to a full
recovery of that timeline in the future should you choose to do so.
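As a rough illustration of "copy it somewhere safe", here is a minimal sketch. It is not what the patch does; it assumes the archive is a plain directory in which archived segments keep their original file names, and the function name and all three paths are hypothetical.

```python
import os
import shutil


def save_unarchived_segments(xlog_dir, archive_dir, safe_dir):
    """Copy local segment files that are missing from the archive.

    Assumption: a segment counts as archived when a file of the same
    name exists in archive_dir. Returns the names that were copied.
    """
    os.makedirs(safe_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(xlog_dir)):
        src = os.path.join(xlog_dir, name)
        if not os.path.isfile(src):
            continue  # skip subdirectories such as status directories
        if os.path.exists(os.path.join(archive_dir, name)):
            continue  # already in the archive, nothing to save
        shutil.copy2(src, os.path.join(safe_dir, name))
        copied.append(name)
    return copied
```

The local files are copied rather than moved, so nothing in the data directory changes; deciding what to do with the copies is left to the administrator.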

The code and its location may be somewhat more suspect.... :)

>
> * I am wondering if it wouldn't be a good idea to remove the local copy
> of any segment we successfully obtain from archive.  The existing
> comments note that we might get a wrong or corrupted file from archive,
> but aren't we in at least as much risk of using an obsolete segment
> restored from backup if we leave the local segment in place?  (The
> archive recovery run itself will know not to do this, but if we crash
> shortly thereafter, the ensuing recovery run would NOT know not to
> trust such files.)
>

I agree they're a loose end that needs some thought.

I avoided that decision by going around the files. We originally agreed
that we would keep that data....reason was you can't tell whether the
files have been restored by a backup that forgot to exclude pg_xlog, or
that we are choosing to do a PITR recovery on an otherwise healthy
system (or as the comments explain maybe we lost everything except
pg_xlog).

If we crash during recovery, the server doesn't run crash recovery and
restart by itself.

If we crash after recovery, then the checkpoint record will have moved
forward, and so we don't then accidentally re-use those local copies.

Timelines will solve this...
>
> Perhaps the last point is really a backup-process issue.  AFAICS there
> is no good reason for a backup tarfile to include $PGDATA/pg_xlog at
> all, and some good reasons for it not to.  Can we redesign either the
> backup process or the disk layout so that that will not happen?  Then
> we could stop worrying about stale local pg_xlog files.
>

That's the way I saw it.

Seems fairly easy to say "don't back up pg_xlog", but you can't guarantee
they won't, even if you tell them not to...
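For illustration only, a backup script that leaves pg_xlog out might look like the sketch below. The function name, the "data" archive root and the choice to keep an empty pg_xlog directory entry are all assumptions, not part of the patch or of any agreed backup procedure.

```python
import os
import tarfile


def backup_without_xlog(pgdata, out_path, xlog_dir="pg_xlog"):
    """Write a gzipped tar of pgdata, omitting the contents of xlog_dir.

    Assumption: the xlog_dir entry itself is kept (as an empty
    directory) so that the restored tree still has the expected layout.
    """
    root = "data"  # hypothetical top-level name inside the tar file
    skip_prefix = f"{root}/{xlog_dir}/"

    def drop_xlog_contents(member):
        # Returning None tells tarfile to leave this member out.
        return None if member.name.startswith(skip_prefix) else member

    with tarfile.open(out_path, "w:gz") as tar:
        tar.add(pgdata, arcname=root, filter=drop_xlog_contents)
    return out_path
```

A script like this only helps people who actually use it, which is the point made above: a plain tar of the whole data directory would still pick up pg_xlog.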

What is stale today may be considered to be actually your best option
when testing to see whether a recovery has achieved your objectives.


I'll read the whole patch and your comments, and test, before I respond
further. Thanks for working so hard on this, so quickly.

Best Regards, Simon Riggs


