Re: Why we really need timelines *now* in PITR - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Why we really need timelines *now* in PITR
Msg-id 1090279679.28049.684.camel@stromboli
In response to Re: Why we really need timelines *now* in PITR  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Why we really need timelines *now* in PITR
List pgsql-hackers
On Mon, 2004-07-19 at 23:15, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > On Mon, 2004-07-19 at 19:33, Tom Lane wrote:

> >> * When we need to do recovery, we first identify the source timeline
> >> (either by reading the current timeline ID from pg_control, or the DBA
> >> can tell us with a parameter in recovery.conf).
> 
> > ** Surely it is the backup itself that determines the source timeline? 
> 
> The backup determines the starting point, but there may be several
> timelines you could follow after that (especially in the scenario where
> you're redoing a recovery starting from the same backup).  The point
> here is that there could be timeline branches after the backup
> occurred.  So yes the backup has to be in an ancestral timeline, but not
> necessarily exactly the recovery-target timeline.
> 

Agreed.

> > ...thinking....recovery.conf would need to specify:
> > recovery_target (if there is one, either a time or txnid)
> > recovery_target_timeline (if there is one, otherwise end of last one)
> > recovery_target_history_file (which specifies how the timeline ids are
> > sequenced)
> 
> No, the source timeline is not necessarily associated with a
> recovery_target --- for instance you might want it to run to the end of
> a particular timeline.  I suspect it might be more confusing than
> helpful to use the term "target timeline".
> 

I think we're heatedly agreeing again.

A summary: we don't specify the start timeline, but we do specify the
timeline that contains our chosen endpoint. [But when we reach it, we
may create a new timeline ID if we didn't go to the end of the logs on
that timeline.] The history file specifies how to get from start to end,
through however many branch points there are, and the history file we
use for recovery is the one pointed to by target_in_timeline.

Or even shorter:
- backup specifies starting timeline (and so user specifies indirectly)
- user specifies end point (explicitly in recovery.conf)
- history file shows how to get from start to end
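
To make the three points above concrete, here is a minimal sketch of
walking from the backup's timeline to the chosen end timeline. It
assumes the history information can be reduced to a "child -> parent"
mapping; the function name timeline_path and the dict layout are
hypothetical, not a proposed file format.

```python
def timeline_path(start, target, parent_of):
    """Return the timeline IDs from start to target, inclusive.

    parent_of maps a timeline ID to the timeline it branched from
    (a hypothetical, simplified view of the history file).
    """
    path = [target]
    seen = {target}
    # Walk backwards from the target until we reach the backup's timeline.
    while path[-1] != start:
        parent = parent_of.get(path[-1])
        if parent is None or parent in seen:
            raise ValueError(
                f"timeline {start} is not an ancestor of timeline {target}"
            )
        path.append(parent)
        seen.add(parent)
    path.reverse()
    return path


# Hypothetical history: 2 and 3 both branch off 1, and 4 branches off 2.
parent_of = {2: 1, 3: 1, 4: 2}
print(timeline_path(1, 4, parent_of))  # [1, 2, 4]
```

A backup taken on timeline 3 could not be used to reach timeline 4 in
this example, so the sketch raises an error rather than guessing.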

More thoughts... suppose recovery.conf accepts:
target = X
target_in_timeline

where the default target is <notarget> and, if you do specify a target,
the default target_in_timeline is <latest>.
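
As a rough sketch of those defaults (the function name and the
dict-based settings are assumptions for illustration; what
target_in_timeline should default to when no target is given is not
stated here, so the sketch leaves it unset):

```python
NO_TARGET = "<notarget>"
LATEST = "<latest>"


def resolve_recovery_settings(settings):
    """Apply the defaults described above to a dict of parsed settings."""
    target = settings.get("target", NO_TARGET)
    if "target_in_timeline" in settings:
        # An explicit value always wins.
        timeline = settings["target_in_timeline"]
    elif target != NO_TARGET:
        # A target was given without a timeline: default to the latest one.
        timeline = LATEST
    else:
        # No target at all: behaviour not specified, so leave it unset.
        timeline = None
    return target, timeline


print(resolve_recovery_settings({"target": "X"}))  # ('X', '<latest>')
```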

I don't like the name target_in_timeline, I'm just trying to clarify
what we mean so we can think of a better name for it.


...we definitely need an offline-log inspection tool, don't we? 
Next month...

> We will need to recommend to DBAs that they not delete Y.history from
> the archive unless they've already deleted all Y.whatever log segments.
> Once they have done this, the past existence of timeline Y is no longer
> of interest and so there'd be no real problem in recycling the ID.
> I would say the above is just as true if you use random IDs as if you
> use sequential ones.  I distrust systems that assume there will never be
> a collision of "randomly-chosen" IDs.
> 

Yes, I argued myself in a circle, but it seemed worth recording just to
avoid repeating the thought next time.
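
For the record, the archive rule quoted above could be checked along
these lines. This is only a sketch: it assumes archive files are named
"<timeline>.<something>" as in the "Y.history" / "Y.whatever" shorthand,
and both function names are hypothetical.

```python
def safe_to_delete_history(timeline_id, archive_files):
    """True if no log segments of this timeline remain in the archive."""
    prefix = f"{timeline_id}."
    history = f"{timeline_id}.history"
    return not any(
        name.startswith(prefix) and name != history for name in archive_files
    )


def can_recycle_timeline_id(timeline_id, archive_files):
    """True once neither segments nor the history file of this ID remain."""
    prefix = f"{timeline_id}."
    return not any(name.startswith(prefix) for name in archive_files)


archive = ["X.history", "Y.history", "Y.0001"]
print(safe_to_delete_history("Y", archive))   # False: Y.0001 still exists
print(can_recycle_timeline_id("Y", archive))  # False
```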


Best regards, Simon Riggs


