Re: Why we really need timelines *now* in PITR - Mailing list pgsql-hackers

From: Tom Lane
Subject: Re: Why we really need timelines *now* in PITR
Msg-id: 8328.1090207906@sss.pgh.pa.us
In response to: Re: Why we really need timelines *now* in PITR (Simon Riggs <simon@2ndquadrant.com>)
Responses: Re: Why we really need timelines *now* in PITR
List: pgsql-hackers

Simon Riggs <simon@2ndquadrant.com> writes:
> The way you write this makes me think you might mean you would allow: we
> can start recovering in one timeline, then rollforward takes us through
> all the timeline nexus points required to get us to the target
> timeline.

Sure.  Let's draw a diagram:

0001.0014 - 0001.0015 - 0001.0016 - 0001.0017 - ...
                      |
                      + 0002.0016 - 0002.0017 - ...
                                  |
                                  + 0003.0017 - ...

If you decide you would like to recover to someplace in timeline 0002,
you need to take the 0002 log files where they exist, and the 0001
log files where there is no 0002, except you do not revert to 0001
once you have used an 0002 file (this restriction is needed in case
the 0001 timeline goes to higher segment numbers than 0002 has reached).
In no case do you use an 0003 file.
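
The selection rule above can be sketched in a few lines of Python. This is only an illustration, not a proposed implementation: the function name `choose_segments`, the `(timeline, segment)` tuples, and the `ancestry` list (the target timeline's parents, oldest first) are all assumptions made for the example, and segment numbers are treated as plain integers.

```python
def choose_segments(available, ancestry, start_segment):
    """Pick the log files to replay, in order, for a target timeline.

    available     -- set of (timeline, segment) pairs that exist (hypothetical model)
    ancestry      -- timelines leading to the target, oldest first, e.g. [1, 2]
    start_segment -- first segment needed after restoring the base backup
    """
    plan = []
    level = 0  # index into ancestry of the timeline currently in use
    seg = start_segment
    while True:
        # Prefer the newest timeline in the ancestry that has this segment,
        # but never look at one older than the timeline already in use.
        for i in range(len(ancestry) - 1, level - 1, -1):
            if (ancestry[i], seg) in available:
                level = i
                plan.append((ancestry[i], seg))
                break
        else:
            # No usable file for this segment: recovery ends here.
            return plan
        seg += 1


# The diagram, with timeline 0001 running on to a segment 0002 never reached.
available = {(1, 14), (1, 15), (1, 16), (1, 17), (1, 18),
             (2, 16), (2, 17),
             (3, 17)}
plan = choose_segments(available, ancestry=[1, 2], start_segment=14)
```

Here `plan` comes out as 0001.0014, 0001.0015, 0002.0016, 0002.0017: the 0001.0018 file is ignored because an 0002 file has already been used, and 0003 is never considered because it is not in the ancestry of the target.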

> I had imagined that recovery would only ever be allowed to start and end
> on the same timeline. I think you probably mean that?

Logically it's all one timeline, I suppose, but to implement it
physically that way would mean duplicating all past 0001 segments when
we want to create the 0002 timeline.  That's not practical and not
necessary.

> Another of the issues I was thinking through was what happens at the end
> of your scenario above
> - You're on timeline 1 and you need to perform recovery.
> - You perform recovery and timeline 2 is created.
> - You discover another error and decide to recover again.
> - You recover timeline 1 again: what do you name the new timeline
> created? 2 or 3?

You really want to call it 3.  To enforce this mechanically would
require having a counter that sits outside the $PGDATA directory and
is not subject to being reverted by a restore-from-backup.  I don't
see any very clean way to do that at the moment --- any thoughts?

In the absence of such a counter we could ask the DBA to specify a new
timeline number in recovery.conf, but this strikes me as one of those
easy-to-get-wrong things ...

One possibility is to extend the archiving API so that we can inquire
about the largest timeline number that exists anywhere in the archive.
If we take new timeline number = 1 + max(any in archive, any in pg_xlog)
then we are safe.  But I'm not really convinced that such a thing would
be any less error-prone than the manual way :-(, because for any
archival method that's more complicated than "cp them all into one
directory", it'd be hard to extract the max stored filename.
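
For the simple "one directory" case, the computation would look roughly like the sketch below. It assumes file names in the TTTT.SSSS form used in the diagram, which is just this message's notation, and the function name `next_timeline_id` is made up for the example; how the two lists of names are obtained is exactly the hard part and is not shown.

```python
def next_timeline_id(archive_names, pg_xlog_names):
    """Return 1 + the largest timeline number seen in either place.

    Both arguments are iterables of file names of the assumed form
    "TTTT.SSSS", where TTTT is the timeline and SSSS the segment.
    """
    timelines = [int(name.split(".")[0])
                 for name in list(archive_names) + list(pg_xlog_names)]
    # With no files at all, fall back to timeline 1.
    return 1 + max(timelines, default=0)


# pg_xlog was reverted by the restore and only knows timeline 1,
# but the archive still holds files from timeline 2.
new_id = next_timeline_id(
    archive_names=["0001.0014", "0001.0015", "0002.0016", "0002.0017"],
    pg_xlog_names=["0001.0016", "0001.0017"],
)
```

In this example `new_id` is 3, which is the answer wanted in the recover-twice scenario; looking at pg_xlog alone would have given 2.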
        regards, tom lane