Home > mailing lists

Re: HeapTupleSatisfiesToast() busted? (was atomic pin/unpin causing errors) - Mailing list pgsql-hackers

From	Jeff Janes
Subject	Re: HeapTupleSatisfiesToast() busted? (was atomic pin/unpin causing errors)
Date	May 11, 2016 00:36:11
Msg-id	CAMkU=1wknx-aG5k=uyd2PW7nX1HMDoFF_jxOb4mY+jwf815vUg@mail.gmail.com Whole thread
In response to	Re: HeapTupleSatisfiesToast() busted? (was atomic pin/unpin causing errors) (Andres Freund <andres@anarazel.de>)
Responses	Re: HeapTupleSatisfiesToast() busted? (was atomic pin/unpin causing errors) Re: HeapTupleSatisfiesToast() busted? (was atomic pin/unpin causing errors)
List	pgsql-hackers

Tree view

On Tue, May 10, 2016 at 4:02 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2016-05-10 15:53:38 -0700, Jeff Janes wrote:
>>
>> But isn't CreateCheckPoint called at the end of the checkpoint, not
>> the start of it?
>
> No, CreateCheckPoint() does it all.
>
>
> CreateCheckPoint(int flags)
> {
> ...
>     /* 1) determine redo pointer */
>     WALInsertLockAcquireExclusive();
>     curInsert = XLogBytePosToRecPtr(Insert->CurrBytePos);
>     prevPtr = XLogBytePosToRecPtr(Insert->PrevBytePos);
>     WALInsertLockRelease();
> ...
>     /* 2) determine oid */
>     LWLockAcquire(OidGenLock, LW_SHARED);
>     checkPoint.nextOid = ShmemVariableCache->nextOid;
>     if (!shutdown)
>         checkPoint.nextOid += ShmemVariableCache->oidCount;
>     LWLockRelease(OidGenLock);
> ...
>     /* 3) actually checkpoint shared_buffers et al. */
>     CheckPointGuts(checkPoint.redo, flags);
> ...
>     /* 4) log the checkpoint */
>     recptr = XLogInsert(RM_XLOG_ID,
>                         shutdown ? XLOG_CHECKPOINT_SHUTDOWN :
>                         XLOG_CHECKPOINT_ONLINE);
> ...
> }

OK, got it.  I don' t know how I missed the bigger picture of that
function in the first place.

But, another perhaps stupid question, why do we care what the value of
nextOid was at the start of the last successfully completed
checkpoint?  Wouldn't it make more sense to populate that part of the
record at the end, not the start, of the checkpoint?

>
>
>> I don't understand how it could be out-of-date at that point.  But
>> obviously it is.
>
> A checkpoint logically "starts" at 1) in the above abbreviated
> CreateCheckPoint(), that's where recovery starts when starting up from
> that checkpoint. But inbetween 1) and 4) all other backends can continue
> to insert WAL, and it'll be replayed *before* the checkpoint's record
> itself.  That means that if some backend generates a NEXTOID record
> between 2) and 4) (with largers checkpoints we're looking at minutes to
> an hour of time), it's effects will temporarily take effect (as in
> ShmemVariableCache->nextOid is updated), but XLOG_CHECKPOINT_ONLINE's
> replay will overwrite it unconditionally:
> void
> xlog_redo(XLogReaderState *record)
> {
>     else if (info == XLOG_CHECKPOINT_ONLINE)
>     {
> ...
>         /* ... but still treat OID counter as exact */
>         LWLockAcquire(OidGenLock, LW_EXCLUSIVE);
>         ShmemVariableCache->nextOid = checkPoint.nextOid;
>         ShmemVariableCache->oidCount = 0;
>         LWLockRelease(OidGenLock);
>
> Makes sense?

So it seems like we should unconditionally **not** do that when
replaying XLOG_CHECKPOINT_ONLINE, right?

If this is the checkpoint record at which we started the system then
we would have done that "replay" into ShmemVariableCache->nextOid
during StartupXLOG(), there is no reason to do it again when redo
passes through that record the 2nd time.

But, to round this out, all of this is formally only a hint on where
to start OIDs so as to avoid performance problems as you busy-loop
looking for an unused OID.  The real correctness bug is in the
hint-bit problem you discuss elsewhere that turns collisions into
corruption, and this bug just makes that other one reachable?

Cheers,

Jeff

pgsql-hackers by date:

From: Andres Freund
Date: 11 May 2016, 00:23:12
Subject: Re: asynchronous and vectorized execution

From: Andres Freund
Date: 11 May 2016, 00:50:23
Subject: Re: asynchronous and vectorized execution

Re: HeapTupleSatisfiesToast() busted? (was atomic pin/unpin causing errors) - Mailing list pgsql-hackers

Previous

Next