Re: How should pg_standby get over the gap of timeline? - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: How should pg_standby get over the gap of timeline?
Msg-id 1227346132.7015.110.camel@hp_dx2400_1
In response to Re: How should pg_standby get over the gap of timeline?  ("Fujii Masao" <masao.fujii@gmail.com>)
Responses Re: How should pg_standby get over the gap of timeline?
List pgsql-hackers
On Sat, 2008-11-22 at 03:39 +0900, Fujii Masao wrote:
> Hi, Simon. Thanks for the comment!!
> 
> On Sat, Nov 22, 2008 at 2:09 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> >
> > On Thu, 2008-11-20 at 22:41 +0900, Fujii Masao wrote:
> >
> >> In the current Synch Rep patch, the standby cannot catch up with the
> >> primary which has a bigger timeline. So, whenever making the standby
> >> catch up, a fresh base backup is required. This is obviously undesirable,
> >> and I'd like to get rid of this restriction.
> >>
> >> Postgres itself can recover up to a bigger timeline without a base
> >> backup. The remaining problem is that pg_standby cannot get over the
> >> gap of timeline. It continues waiting for the XLOG file with out-of-date
> >> timeline, and redo doesn't progress.
> >
> > We've discussed this before. My answer is the same: you are assuming it
> > is safe to re-enter recovery, which is not correct (currently).
> 
> I'm afraid you might be right. But I cannot understand yet why it's not
> safe to re-enter recovery. Is it safe to re-enter recovery from the
> restart point after PITR stopped halfway? If it's safe, ISTM that PITR
> without a base backup also is safe. Please let me know what might
> violate a re-entry of recovery. What is your worry?

My worry is that there has not been an exhaustive analysis. "Almost
correct" and "probably correct" are not the same thing as "correct". We
need to look through all of the changes that occur at the end of
recovery to be certain we can do this. Luckily, normal data blocks don't
know anything about such state changes, so that is a good start. We must
look at:

Timelines
control file
StartupCLOG, StartupMultiXact, etc.
autovacuum starting
relcache init file
flat files
archive status
pg_xlog
two phase commit
...
every single file type in Postgres...

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


