Hi all,
On 2012-11-27 19:52:09 +0000, tarvip@gmail.com wrote:
> This happens only if wal_level=hot_standby.
>
>
> Here are the steps to reproduce this issue.
Oh. Fucking. Wow.
I think this tiny comment just helped me find the bug. After previously
having looked for it without success for some time I just recognized the
problem while working on something unrelated.
Here it goes:
void
CreateCheckPoint(int flags)
{
...
/*
* Get the other info we need for the checkpoint record.
*/
LWLockAcquire(XidGenLock, LW_SHARED);
checkPoint.nextXid = ShmemVariableCache->nextXid;
checkPoint.oldestXid = ShmemVariableCache->oldestXid;
checkPoint.oldestXidDB = ShmemVariableCache->oldestXidDB;
LWLockRelease(XidGenLock);
/* Increase XID epoch if we've wrapped around since last checkpoint */
checkPoint.nextXidEpoch = ControlFile->checkPointCopy.nextXidEpoch;
if (checkPoint.nextXid < ControlFile->checkPointCopy.nextXid)
checkPoint.nextXidEpoch++;
// i.e. compute the epoch based on the *current* nextXid.
...
// do all the writing, take some time
CheckPointGuts(checkPoint.redo, flags);
...
* Update checkPoint.nextXid since we have a later value
*/
if (!shutdown && XLogStandbyInfoActive())
LogStandbySnapshot(&checkPoint.nextXid);
...
/* Update shared-memory copy of checkpoint XID/epoch */
{
/* use volatile pointer to prevent code rearrangement */
volatile XLogCtlData *xlogctl = XLogCtl;
SpinLockAcquire(&xlogctl->info_lck);
xlogctl->ckptXidEpoch = checkPoint.nextXidEpoch;
xlogctl->ckptXid = checkPoint.nextXid;
SpinLockRelease(&xlogctl->info_lck);
}
...
}
Notice the end of the comment above/about LogStandbySnapshot. It updates
nextXid! But it does *not* recheck whether we have had a wraparound +
so it does not increase the epoch counter. Although we might have had a
wraparound inbetween.
Which in turn means the the ShmemVariableCache->nextXid <
xlogctl->ckptXid computation in GetNextXidAndEpoch doesn't return
correct results anymore because it can't recognize that were in a new
epoch now.
Trivial patch attached.
I am not sure that I understand why its interesting to have a newer
->nextXid in the checkpoint, but I don't see anything that would
make it dangerous besides this.
This looks like it also explains #6291 and the slighly different issue
described in CAAZKuFbB7UR3NXV1pkZFRXy=6V1QBq_OeHJWJNTLpKBpH=QFgg@mail.gmail.com,
as the issue is just as present on standbys as it is on primaries.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services