Re: BUG #8686: Standby could not restart. - Mailing list pgsql-bugs

From Heikki Linnakangas
Subject Re: BUG #8686: Standby could not restart.
Date
Msg-id 52B46D0D.2070505@vmware.com
In response to BUG #8686: Standby could not restart.  (katsumata.tomonari@po.ntts.co.jp)
Responses Re: BUG #8686: Standby could not restart.  (Tomonari Katsumata <t.katsumata1122@gmail.com>)
List pgsql-bugs
On 12/19/2013 04:57 AM, katsumata.tomonari@po.ntts.co.jp wrote:
> At first, I suspected that the recovery state had reached "consistent"
> before redo started.
> And then I checked pg_control and related WAL.
> The WAL sequence is like below.
>
>
> WAL--(a)--(b)--(c)--(d)--(e)-->
> ================================================
> (a) Latest checkpoint's REDO location     1/783B230
> (b) hot_update                            1/7842010
> (c) truncate                              1/8E7E5C8
> (d) Latest checkpoint location            1/8E7F0B0
> (e) Minimum recovery ending location      1/8E7F110
> ================================================
>
> From these things, I found that it happens with this scenario.
> ----------
> (1) standby starting
> (2) seeking checkpoint location 1/8E7F0B0 because backup_label is absent
> (3) reachedConsistency is set to true at 1/8E7F110 in
> CheckRecoveryConsistency
> (4) redo starts from 1/783B230
> (5) PANIC at 1/7842010 because reachedConsistency has already been set and
> the record operates on a block that will be truncated at (c).
> ----------
>
> At step (2), EndRecPtr is set to 1/8E7F110 (just past the checkpoint record
> at 1/8E7F0B0), so reachedConsistency is set to true at step (3).

Yep. Thanks for a good explanation.
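
To make the timing concrete, here is a small standalone C program
(illustration only, not PostgreSQL code; the variable names and the LSN()
macro are made up for this example, the values are the ones from the report)
showing why the consistency test already passes before redo begins:

    #include <stdio.h>
    #include <stdint.h>
    #include <stdbool.h>
    #include <inttypes.h>

    /* Build a 64-bit LSN from the "hi/lo" form printed by pg_controldata. */
    #define LSN(hi, lo)  (((uint64_t) (hi) << 32) | (uint64_t) (lo))

    int
    main(void)
    {
        uint64_t redo_ptr           = LSN(1, 0x783B230); /* (a) REDO location */
        uint64_t checkpoint_rec     = LSN(1, 0x8E7F0B0); /* (d) checkpoint record */
        uint64_t end_rec_ptr        = LSN(1, 0x8E7F110); /* EndRecPtr after reading (d) */
        uint64_t min_recovery_point = LSN(1, 0x8E7F110); /* (e) min recovery ending */
        bool     reached_consistency;

        /*
         * Steps (2)-(3): after merely reading the checkpoint record, EndRecPtr
         * already equals minRecoveryPoint, so a check of the form
         * "minRecoveryPoint <= last replayed position" fires immediately.
         */
        reached_consistency = (min_recovery_point <= end_rec_ptr);
        printf("consistent before any redo? %s\n",
               reached_consistency ? "yes (the bug)" : "no");

        /* Step (4): replay actually has to start much earlier, at the REDO point. */
        printf("redo begins %" PRIu64 " bytes before the checkpoint record\n",
               checkpoint_rec - redo_ptr);
        return 0;
    }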

> I think there is no need to advance EndRecPtr while seeking the checkpoint
> location.
> I tried revising it, and this worked fine.

Hmm. There's this comment in StartupXLOG, after reading the checkpoint
record but before reading the first record at the REDO point:


>         /*
>          * Initialize shared replayEndRecPtr, lastReplayedEndRecPtr, and
>          * recoveryLastXTime.
>          *
>          * This is slightly confusing if we're starting from an online
>          * checkpoint; we've just read and replayed the checkpoint record, but
>          * we're going to start replay from its redo pointer, which precedes
>          * the location of the checkpoint record itself. So even though the
>          * last record we've replayed is indeed ReadRecPtr, we haven't
>          * replayed all the preceding records yet. That's OK for the current
>          * use of these variables.
>          */
>         SpinLockAcquire(&xlogctl->info_lck);
>         xlogctl->replayEndRecPtr = ReadRecPtr;
>         xlogctl->lastReplayedEndRecPtr = EndRecPtr;
>         xlogctl->recoveryLastXTime = 0;
>         xlogctl->currentChunkStartTime = 0;
>         xlogctl->recoveryPause = false;
>         SpinLockRelease(&xlogctl->info_lck);

I think we need to fix that confusion. Your patch does it by not setting
EndRecPtr yet; that fixes the bug, but it leaves those variables in a
slightly strange state: I'm not sure what EndRecPtr points to in that
case (0?), though ReadRecPtr would presumably still be set.

Perhaps we should reset replayEndRecPtr and lastReplayedEndRecPtr to the
REDO point here, instead of ReadRecPtr/EndRecPtr.

- Heikki
