Re: Tracking latest timeline in standby mode - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Tracking latest timeline in standby mode
Date
Msg-id 4D237E32.2070204@enterprisedb.com
In response to Re: Tracking latest timeline in standby mode  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: Tracking latest timeline in standby mode  (Fujii Masao <masao.fujii@gmail.com>)
List pgsql-hackers
On 02.11.2010 07:15, Fujii Masao wrote:
> On Mon, Nov 1, 2010 at 8:32 PM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com>  wrote:
>> Yeah, that's one approach. Another is to validate the TLI in the xlog page
>> header, it should always match the current timeline we're on. That would
>> feel more robust to me.
>
> Yeah, that seems better.

I finally got around to looking at this. I wrote a patch to validate that
the TLI on the xlog page header matches ThisTimeLineID during recovery, and
quickly noticed in testing that it doesn't catch all the cases I'd like
to catch :-(.

The problem scenario is this:


TLI 1 -----------+C-------+------->Standby
                 .
                 .
TLI 2            +C-------+------->


The two horizontal lines represent two timelines. TLI 2 forks off from 
TLI 1, because of a failover to a not-completely up-to-date standby 
server, for example. The plus-signs represent WAL segment boundaries and 
C's represent checkpoint records.

Another standby server has replayed all the WAL on TLI 1. Its latest
restartpoint is C. The checkpoint records on the two timelines are
at the same location, at the beginning of the WAL files - not all that
unlikely if you have archive_timeout set, for example.

Now, if you stop and restart the standby, it will try to recover to the 
latest timeline, which is TLI 2. But before the restart, it had already 
replayed the WAL from TLI 1, so it's wrong to replay the WAL from the 
parallel universe of TLI 2. At the moment, it will go ahead and do it, 
and you end up with an inconsistent database.

I planned to fix that by checking the TLI on the xlog page header, but
that alone isn't enough in the above scenario. The TLIs on the page
headers on timeline 2 are what's expected: the first page of the segment
has TLI==1, because it was just forked off from timeline 1, and the
subsequent pages have TLI==2, as they should after the checkpoint record.
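
To make that concrete, here is a toy model in Python (not PostgreSQL
code; the function name, the page list and the exact rule are made up
for illustration). It assumes a header-only check that accepts a page
if its TLI is part of the target timeline's history and TLIs never go
backwards. The segment read from TLI 2 passes, even though the standby
had already replayed TLI 1 past the fork:

```python
def page_tlis_look_valid(page_tlis, expected_tlis):
    """Header-only check (hypothetical): every page TLI must belong to
    the target timeline's history, and TLIs must never go backwards."""
    last_tli = 0
    for tli in page_tlis:
        if tli not in expected_tlis or tli < last_tli:
            return False
        last_tli = tli
    return True


# Assumed page TLIs of the first segment after the fork, read from TLI 2:
# the first page still carries TLI 1, the following pages carry TLI 2.
segment_on_tli2 = [1, 2, 2, 2]

# The recovery target is TLI 2, whose history consists of TLIs 1 and 2.
accepted = page_tlis_look_valid(segment_on_tli2, {1, 2})
```

The check has no input that says what the standby had replayed before
the restart, so it cannot tell the two cases apart.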

So we have to remember which timeline we were on before the restart.
We already remember how far we had replayed, that's the
minRecoveryPoint we store in the control file, but we have to remember
the timeline along with it.

On reflection, your idea of checking the history file before replaying 
anything seems much easier. We'll still need to add the timeline 
alongside minRecoveryPoint to do the checking, but it's a lot easier to 
do against the history file. And we can validate the TLIs on page 
headers against the information from the history file as we read in the WAL.
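
A rough sketch of the check I have in mind, again as a Python model
rather than real code. HistoryEntry, can_recover_to and the WAL
positions are invented for illustration, and it assumes the history
file tells us where each parent timeline was left, and that the control
file stores a TLI next to minRecoveryPoint:

```python
from typing import List, NamedTuple, Optional


class HistoryEntry(NamedTuple):
    tli: int            # timeline ID
    end: Optional[int]  # WAL position where this timeline was left;
                        # None for the target timeline itself


def can_recover_to(history: List[HistoryEntry],
                   min_recovery_point: int,
                   min_recovery_tli: int) -> bool:
    """Refuse the target timeline if we already replayed WAL on another
    timeline beyond the point where the target's history left it."""
    for entry in history:
        if entry.tli == min_recovery_tli:
            return entry.end is None or min_recovery_point <= entry.end
    # The timeline we were on is not an ancestor of the target at all.
    return False


# Assumed history of TLI 2: it forked off TLI 1 at position 0x3000000.
tli2_history = [HistoryEntry(1, 0x3000000), HistoryEntry(2, None)]

# The standby replayed TLI 1 up to 0x3800000, past the fork: refuse.
past_fork = can_recover_to(tli2_history, 0x3800000, 1)

# A standby that stopped at 0x2000000 on TLI 1 can still follow TLI 2.
before_fork = can_recover_to(tli2_history, 0x2000000, 1)
```

With the scenario above, the first call refuses to switch to TLI 2 and
the second allows it.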

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

