Home > mailing lists

Re: BUG #7710: Xid epoch is not updated properly during checkpoint - Mailing list pgsql-bugs

From	Andres Freund
Subject	Re: BUG #7710: Xid epoch is not updated properly during checkpoint
Date	December 1, 2012 22:59:33
Msg-id	20121201225921.GA25134@awork2.anarazel.de Whole thread Raw
In response to	BUG #7710: Xid epoch is not updated properly during checkpoint (tarvip@gmail.com)
List	pgsql-bugs

Tree view

Hi all,

On 2012-11-27 19:52:09 +0000, tarvip@gmail.com wrote:
> This happens only if wal_level=hot_standby.
>
>
> Here are the steps to reproduce this issue.

Oh. Fucking. Wow.

I think this tiny comment just helped me find the bug. After previously
having looked for it without success for some time I just recognized the
problem while working on something unrelated.

Here it goes:

void
CreateCheckPoint(int flags)
{
...
    /*
     * Get the other info we need for the checkpoint record.
     */
    LWLockAcquire(XidGenLock, LW_SHARED);
    checkPoint.nextXid = ShmemVariableCache->nextXid;
    checkPoint.oldestXid = ShmemVariableCache->oldestXid;
    checkPoint.oldestXidDB = ShmemVariableCache->oldestXidDB;
    LWLockRelease(XidGenLock);

    /* Increase XID epoch if we've wrapped around since last checkpoint */
    checkPoint.nextXidEpoch = ControlFile->checkPointCopy.nextXidEpoch;
    if (checkPoint.nextXid < ControlFile->checkPointCopy.nextXid)
        checkPoint.nextXidEpoch++;
// i.e. compute the epoch based on the *current* nextXid.

...
// do all the writing, take some time
        CheckPointGuts(checkPoint.redo, flags);
...
     * Update checkPoint.nextXid since we have a later value
     */
    if (!shutdown && XLogStandbyInfoActive())
        LogStandbySnapshot(&checkPoint.nextXid);
...
    /* Update shared-memory copy of checkpoint XID/epoch */
    {
        /* use volatile pointer to prevent code rearrangement */
        volatile XLogCtlData *xlogctl = XLogCtl;

        SpinLockAcquire(&xlogctl->info_lck);
        xlogctl->ckptXidEpoch = checkPoint.nextXidEpoch;
        xlogctl->ckptXid = checkPoint.nextXid;
        SpinLockRelease(&xlogctl->info_lck);
    }
...
}

Notice the end of the comment above/about LogStandbySnapshot. It updates
nextXid! But it does *not* recheck whether we have had a wraparound +
so it does not increase the epoch counter. Although we might have had a
wraparound inbetween.

Which in turn means the the ShmemVariableCache->nextXid <
xlogctl->ckptXid computation in GetNextXidAndEpoch doesn't return
correct results anymore because it can't recognize that were in a new
epoch now.

Trivial patch attached.

I am not sure that I understand why its interesting to have a newer
->nextXid in the checkpoint, but I don't see anything that would
make it dangerous besides this.

This looks like it also explains #6291 and the slighly different issue
described in CAAZKuFbB7UR3NXV1pkZFRXy=6V1QBq_OeHJWJNTLpKBpH=QFgg@mail.gmail.com,
as the issue is just as present on standbys as it is on primaries.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

0001-Fix-xid-epoch-calculation-with-wal_level-hot_standby.patch

pgsql-bugs by date:

From: Jeff Janes
Date: 01 December 2012, 22:59:14
Subject: Re: PITR potentially broken in 9.2

From: Andres Freund
Date: 01 December 2012, 23:10:31
Subject: Re: BUG #7710: Xid epoch is not updated properly during checkpoint

Re: BUG #7710: Xid epoch is not updated properly during checkpoint - Mailing list pgsql-bugs

Attachment

Previous

Next