Re: Incorrect logic in XLogNeedsFlush() - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Incorrect logic in XLogNeedsFlush()
Date
Msg-id CAFiTN-tY8YbES0E9+xdm4XqsP=K5G5H8m39er5CSm0gyNGEJUw@mail.gmail.com
Whole thread Raw
In response to Incorrect logic in XLogNeedsFlush()  (Melanie Plageman <melanieplageman@gmail.com>)
List pgsql-hackers
On Sat, Aug 30, 2025 at 4:16 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> Hi,
>
> If you call XLogNeedsFlush() immediately after calling XLogFlush() in
> FlushBuffer(), it can return true.
>
> With this diff:
>
> diff --git a/src/backend/storage/buffer/bufmgr.c
> b/src/backend/storage/buffer/bufmgr.c
> index 350cc0402aa..91c3fe99d6e 100644
> --- a/src/backend/storage/buffer/bufmgr.c
> +++ b/src/backend/storage/buffer/bufmgr.c
> @@ -4342,7 +4342,11 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln,
> IOObject io_object,
>      * skip the flush if the buffer isn't permanent.
>      */
>     if (buf_state & BM_PERMANENT)
> +   {
>         XLogFlush(recptr);
> +       if(XLogNeedsFlush(recptr))
> +           elog(ERROR, "should be flushed and isn't");
> +   }
>
> recovery/015_promotion_pages errors out.
>
> During an end-of-recovery checkpoint, XLogNeedsFlush() should compare
> the provided LSN to the flush pointer and not the min recovery point.

That right, "end-of-recovery checkpoint" actually performs a flush
instead of updating the minRecoveryPoint so we should compare it with
flush pointer.

> This only errors out during an end-of-recovery checkpoint after a
> crash when the ControlFile->minRecoveryPoint has not been correctly
> updated. In this case, LocalMinRecoveryPoint is initialized to an
> incorrect value potentially causing XLogNeedsFlush() to incorrectly
> return true.
>
> This trivial diff makes the test pass:
>
> @@ -3115,7 +3125,7 @@ XLogNeedsFlush(XLogRecPtr record)
>      * instead. So "needs flush" is taken to mean whether minRecoveryPoint
>      * would need to be updated.
>      */
> -   if (RecoveryInProgress())
> +   if (RecoveryInProgress() && !XLogInsertAllowed())
>     {
>
> However, since all of this code is new to me, there are some remaining
> things I don't understand:
>
> Why is it okay for other processes than the startup process to
> initialize LocalMinRecoveryPoint from ControlFile->minRecoveryPoint
> during crash recovery?

During crash recovery we don't yet know the minRecoveryPoint until we
replay all the WALs and once it is done it will be updated in the
ControlFile at PerformRecoveryXLogAction()->CreateEndOfRecoveryRecord().
After this point it will be valid to use by any process including the
StartupProcess.  So I don't think until ControlFile->minRecoveryPoint
is initialized it is safe to use for any process.  In fact, before any
connections are allowed this has to be set either by
CreateEndOfRecoveryRecord() if recovering a primary from crash or by
ReachedEndOfBackup() if you are starting a standby from backup.

> Besides this question, I find the use of the global variable
> InRecovery as a proxy for RecoveryInProgress() + "I'm the startup
> process" to be confusing. It seems like UpdateMinRecoveryPoint() and
> XLogNeedsFlush() could just test MyBackendType == B_STARTUP and make
> it explicit.

IIUC InRecovery is slightly different than "RecoveryInProgress() +
"I'm the startup", in StartupXlog, we first set InRecovery to false
and then update the minRecoveryPoint in CreateEndOfRecoveryRecord()
and after that we clear RecoveryInProgress (i.e. set
XLogCtl->SharedRecoveryState = RECOVERY_STATE_DONE;).

--
Regards,
Dilip Kumar
Google



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Sequence Access Methods, round two
Next
From: Michael Paquier
Date:
Subject: Re: Remove traces of long in dynahash.c