Re: Implement waiting for wal lsn replay: reloaded - Mailing list pgsql-hackers

From Xuneng Zhou
Subject Re: Implement waiting for wal lsn replay: reloaded
Date
Msg-id CABPTF7WPRVJGdDeuWo-=3csnDs5FGfUsg97xppPyzoj9fRAMeQ@mail.gmail.com
Whole thread Raw
In response to Re: Implement waiting for wal lsn replay: reloaded  (Andres Freund <andres@anarazel.de>)
Responses Re: Implement waiting for wal lsn replay: reloaded
Re: Implement waiting for wal lsn replay: reloaded
List pgsql-hackers
Hi Andres,

On Tue, Apr 7, 2026 at 11:31 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2026-04-06 23:07:45 -0400, Andres Freund wrote:
> > But, leaving that aside, looking at this code I'm somewhat concerned - it
> > seems to not worry at all about memory ordering?
> >
> >
> > static void
> > XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr, TimeLineID tli)
> > ...
> >       /* Update shared-memory status */
> >       pg_atomic_write_u64(&WalRcv->writtenUpto, LogstreamResult.Write);
> >
> >       /*
> >        * If we wrote an LSN that someone was waiting for, notify the waiters.
> >        */
> >       if (waitLSNState &&
> >               (LogstreamResult.Write >=
> >                pg_atomic_read_u64(&waitLSNState->minWaitedLSN[WAIT_LSN_TYPE_STANDBY_WRITE])))
> >               WaitLSNWakeup(WAIT_LSN_TYPE_STANDBY_WRITE, LogstreamResult.Write);
> >
> > There are no memory barriers here, so the CPU would be entirely free to not
> > make the writtenUpto write visible to a waiter that's in the process of
> > registering and is checking whether it needs to wait in WaitForLSN().
> >
> > And WaitForLSN()->GetCurrentLSNForWaitType()->GetWalRcvWriteRecPtr() also has
> > no barriers.  That MAYBE is ok, due addLSNWaiter() providing the barrier at
> > loop entry and maybe kinda you can think that WaitLatch() will somehow also
> > have barrier semantic.  But if so, that would need to be very carefully
> > documented.  And it seems completely unnecessary here, it's hard to believe
> > using a barrier (via pg_atomic_read_membarrier_u64() or such) would be a
> > performance issue

Thanks for pointing this out.  This is indeed a store-load ordering issue.

> And separately from the memory ordering, how can it make sense that there's
> at least 5 copies of this
>
>                 if (waitLSNState &&
>                         (LogstreamResult.Flush >=
>                          pg_atomic_read_u64(&waitLSNState->minWaitedLSN[WAIT_LSN_TYPE_STANDBY_FLUSH])))
>                         WaitLSNWakeup(WAIT_LSN_TYPE_STANDBY_FLUSH, LogstreamResult.Flush);
>
> around?  That needs to be encapsulated so that if you have a bug, like the
> memory ordering problem I describe above, it can be fixed once, not in
> multiple places.

Yeah, this duplication is not ok.

> And why do these callers even have that pre-check?  Seems WaitLSNWakeup()
> does so itself?
>
>         /*
>          * Fast path check.  Skip if currentLSN is InvalidXLogRecPtr, which means
>          * "wake all waiters" (e.g., during promotion when recovery ends).
>          */
>         if (XLogRecPtrIsValid(currentLSN) &&
>                 pg_atomic_read_u64(&waitLSNState->minWaitedLSN[i]) > currentLSN)
>                 return;
>
> And why is the code checking if waitLSNState is non-NULL?
>

These fast checks are unnecessary copy-pastos and waitLSNState checks
also do not make sense except for the one in WaitLSNCleanup.

I'll prepare a patch set addressing them.

--
Best,
Xuneng



pgsql-hackers by date:

Previous
From: jie wang
Date:
Subject: Re: Return DSA area for hash table from GetNamedDSHash()
Next
From: jian he
Date:
Subject: Re: SQL:2011 Application Time Update & Delete