Re: pgsql: Fix bug leading to restoring unlogged relations from empty files - Mailing list pgsql-committers

From Peter Eisentraut
Subject Re: pgsql: Fix bug leading to restoring unlogged relations from empty files
Date
Msg-id 566E5308.7060109@gmx.net
Whole thread Raw
In response to pgsql: Fix bug leading to restoring unlogged relations from empty files  (Andres Freund <andres@anarazel.de>)
Responses Re: pgsql: Fix bug leading to restoring unlogged relations from empty files  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-committers
Coverity is complaining that in

void
FlushOneBuffer(Buffer buffer)
{
    BufferDesc *bufHdr;

    /* currently not needed, but no fundamental reason not to support */
    Assert(!BufferIsLocal(buffer));

    Assert(BufferIsPinned(buffer));

    bufHdr = GetBufferDescriptor(buffer - 1);

    LWLockHeldByMe(bufHdr->content_lock);

    FlushBuffer(bufHdr, NULL);
}

the call to LWLockHeldByMe() is useless.


On 12/10/15 10:50 AM, Andres Freund wrote:
> Fix bug leading to restoring unlogged relations from empty files.
>
> At the end of crash recovery, unlogged relations are reset to the empty
> state, using their init fork as the template. The init fork is copied to
> the main fork without going through shared buffers. Unfortunately WAL
> replay so far has not necessarily flushed writes from shared buffers to
> disk at that point. In normal crash recovery, and before the
> introduction of 'fast promotions' in fd4ced523 / 9.3, the
> END_OF_RECOVERY checkpoint flushes the buffers out in time. But with
> fast promotions that's not the case anymore.
>
> To fix, force WAL writes targeting the init fork to be flushed
> immediately (using the new FlushOneBuffer() function). In 9.5+ that
> flush can centrally be triggered from the code dealing with restoring
> full page writes (XLogReadBufferForRedoExtended), in earlier releases
> that responsibility is in the hands of XLOG_HEAP_NEWPAGE's replay
> function.
>
> Backpatch to 9.1, even if this currently is only known to trigger in
> 9.3+. Flushing earlier is more robust, and it is advantageous to keep
> the branches similar.
>
> Typical symptoms of this bug are errors like
> 'ERROR:  index "..." contains unexpected zero page at block 0'
> shortly after promoting a node.
>
> Reported-By: Thom Brown
> Author: Andres Freund and Michael Paquier
> Discussion: 20150326175024.GJ451@alap3.anarazel.de
> Backpatch: 9.1-
>
> Branch
> ------
> REL9_5_STABLE
>
> Details
> -------
> http://git.postgresql.org/pg/commitdiff/5b51805fe4b9fec95274924c5733093a5f685a58
>
> Modified Files
> --------------
> src/backend/access/transam/xlogutils.c |    9 +++++++++
> src/backend/storage/buffer/bufmgr.c    |   21 +++++++++++++++++++++
> src/include/storage/bufmgr.h           |    1 +
> 3 files changed, 31 insertions(+)
>
>



pgsql-committers by date:

Previous
From: Tom Lane
Date:
Subject: pgsql: Docs: document that psql's "\i -" means read from stdin.
Next
From: Michael Paquier
Date:
Subject: Re: pgsql: Fix bug leading to restoring unlogged relations from empty files