Re: MarkBufferDirtyHint() and LSN update - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: MarkBufferDirtyHint() and LSN update
Date
Msg-id 20191111054737.GD1707@paquier.xyz
Whole thread Raw
In response to Re: MarkBufferDirtyHint() and LSN update  (Antonin Houska <ah@cybertec.at>)
Responses Re: MarkBufferDirtyHint() and LSN update  (Antonin Houska <ah@cybertec.at>)
List pgsql-hackers
On Thu, Oct 31, 2019 at 09:43:47AM +0100, Antonin Houska wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>> Isn't this prevented by locking of the buffer header? Both FlushBuffer
>> and MarkBufferDirtyHint do obtain that lock. I see MarkBufferDirtyHint
>> does a bit of work before, but that's related to BM_PERMANENT.

In FlushBuffer() you have a window after fetching the buffer state and
the header is once unlocked until TerminateBufferIO() gets called
(this would take again a lock on the buffer header), so it is
logically possible for the buffer to be marked as dirty once again
causing the update of the LSN on the page to be missed even if a
backup block has been written in WAL.

> Yes, I had verified it using gdb: inserted a row into a table, then attached
> gdb to checkpointer and stopped it when FlushBuffer() was about to call
> TerminateBufferIO().  Then, when scanning the table, I saw that
> MarkBufferDirtyHint() skipped the call of PageSetLSN(). Finally, before
> checkpointer unlocked the buffer header in TerminateBufferIO(), buf_state was
> 3553624066 ~ 0b11010011110100000000000000000010.

Small typo here: 11010011110100000000000000000010...

> With BM_DIRTY defined as
>
>     #define BM_DIRTY                (1U << 23)
>
> the flag appeared to be set.

...  Still, the bit is set here.

Does something like the attached patch make sense?  Reviews are
welcome.
--
Michael

Attachment

pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: [HACKERS] Block level parallel vacuum
Next
From: Amit Kapila
Date:
Subject: Re: Ordering of header file inclusion