Re: Implement waiting for wal lsn replay: reloaded - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Implement waiting for wal lsn replay: reloaded
Msg-id jzq5shdewncpxc35r3s2mcfsmo4bjovkza5mnqf5bdfumhfi3g@bglckf7dxmw5
In response to Re: Implement waiting for wal lsn replay: reloaded  (Andres Freund <andres@anarazel.de>)
Responses Re: Implement waiting for wal lsn replay: reloaded
List pgsql-hackers
Hi,

On 2026-04-06 23:07:45 -0400, Andres Freund wrote:
> But, leaving that aside, looking at this code I'm somewhat concerned - it
> seems to not worry at all about memory ordering?
> 
> 
> static void
> XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr, TimeLineID tli)
> ...
>     /* Update shared-memory status */
>     pg_atomic_write_u64(&WalRcv->writtenUpto, LogstreamResult.Write);
> 
>     /*
>      * If we wrote an LSN that someone was waiting for, notify the waiters.
>      */
>     if (waitLSNState &&
>         (LogstreamResult.Write >=
>          pg_atomic_read_u64(&waitLSNState->minWaitedLSN[WAIT_LSN_TYPE_STANDBY_WRITE])))
>         WaitLSNWakeup(WAIT_LSN_TYPE_STANDBY_WRITE, LogstreamResult.Write);
> 
> There are no memory barriers here, so the CPU would be entirely free to not
> make the writtenUpto write visible to a waiter that's in the process of
> registering and is checking whether it needs to wait in WaitForLSN().
> 
> And WaitForLSN()->GetCurrentLSNForWaitType()->GetWalRcvWriteRecPtr() also has
> no barriers.  That MAYBE is ok, due to addLSNWaiter() providing the barrier at
> loop entry, and maybe you can kinda argue that WaitLatch() will somehow also
> have barrier semantics.  But if so, that would need to be very carefully
> documented.  And it seems completely unnecessary here, it's hard to believe
> using a barrier (via pg_atomic_read_membarrier_u64() or such) would be a
> performance issue.
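
To make the ordering concern concrete, here is a minimal standalone sketch.
It uses C11 <stdatomic.h> instead of PostgreSQL's pg_atomic_* API, and every
name in it (written_upto, min_waited_lsn, writer_advance, waiter_must_sleep)
is a hypothetical stand-in, not the actual walreceiver or xlogwait code.  Each
side does a store followed by a load of the other side's variable; without a
full barrier between the two, both loads may observe stale values and the
wakeup can be lost.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for WalRcv->writtenUpto and minWaitedLSN[...]. */
static _Atomic uint64_t written_upto = 0;
static _Atomic uint64_t min_waited_lsn = UINT64_MAX;	/* "nobody waits" */

/*
 * Writer side: publish the new position, then check whether anyone waits
 * for it.  Returns true if a wakeup has to be sent.
 */
static bool
writer_advance(uint64_t new_lsn)
{
	assert(new_lsn != UINT64_MAX);	/* reserved as the "no waiter" marker */

	atomic_store_explicit(&written_upto, new_lsn, memory_order_relaxed);

	/* Full barrier: the store above is ordered before the load below. */
	atomic_thread_fence(memory_order_seq_cst);

	return new_lsn >= atomic_load_explicit(&min_waited_lsn,
										   memory_order_relaxed);
}

/*
 * Waiter side: register the target, then recheck the position.  Returns
 * true if the waiter still has to sleep.
 */
static bool
waiter_must_sleep(uint64_t target_lsn)
{
	atomic_store_explicit(&min_waited_lsn, target_lsn, memory_order_relaxed);

	/* Pairs with the fence in writer_advance(). */
	atomic_thread_fence(memory_order_seq_cst);

	return atomic_load_explicit(&written_upto,
								memory_order_relaxed) < target_lsn;
}
```

With both fences in place, at least one side is guaranteed to see the other's
store: either the writer notices the registration and wakes the waiter, or the
waiter notices the new position and does not sleep.  Drop either fence and the
"both miss" outcome becomes possible.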

And separately from the memory ordering, how can it make sense that there are
at least 5 copies of this

        if (waitLSNState &&
            (LogstreamResult.Flush >=
             pg_atomic_read_u64(&waitLSNState->minWaitedLSN[WAIT_LSN_TYPE_STANDBY_FLUSH])))
            WaitLSNWakeup(WAIT_LSN_TYPE_STANDBY_FLUSH, LogstreamResult.Flush);

around?  That needs to be encapsulated so that if you have a bug, like the
memory ordering problem I describe above, it can be fixed once, not in
multiple places.

And why do these callers even have that pre-check?  Seems WaitLSNWakeup()
does so itself?

    /*
     * Fast path check.  Skip if currentLSN is InvalidXLogRecPtr, which means
     * "wake all waiters" (e.g., during promotion when recovery ends).
     */
    if (XLogRecPtrIsValid(currentLSN) &&
        pg_atomic_read_u64(&waitLSNState->minWaitedLSN[i]) > currentLSN)
        return;
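
One possible shape for such an encapsulation, as a standalone sketch only:
the demo_* names are hypothetical, C11 atomics stand in for pg_atomic_*, and
a counter stands in for actually waking waiters.  The point is that the
store, the ordering, and the fast-path check each live in exactly one place,
so call sites shrink to a single call without a pre-check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

typedef uint64_t DemoLSN;		/* hypothetical stand-in for XLogRecPtr */

enum { DEMO_WAIT_WRITE, DEMO_WAIT_FLUSH, DEMO_WAIT_COUNT };

static _Atomic DemoLSN demo_position[DEMO_WAIT_COUNT];
static _Atomic DemoLSN demo_min_waited[DEMO_WAIT_COUNT] = {
	UINT64_MAX, UINT64_MAX		/* "nobody waits" */
};
static _Atomic int demo_wakeups[DEMO_WAIT_COUNT];

/* The fast-path check exists only here, not in every caller. */
static void
demo_wakeup(int type, DemoLSN current_lsn)
{
	if (atomic_load(&demo_min_waited[type]) > current_lsn)
		return;					/* nobody waits for an LSN this low */

	demo_wakeups[type]++;		/* real code would wake the waiters here */
}

/*
 * Single entry point for "position advanced": publish, then notify.  The
 * default atomic_store()/atomic_load() are sequentially consistent, so the
 * store is ordered before the load in demo_wakeup().
 */
static void
demo_advance_and_notify(int type, DemoLSN new_lsn)
{
	assert(type >= 0 && type < DEMO_WAIT_COUNT);

	atomic_store(&demo_position[type], new_lsn);
	demo_wakeup(type, new_lsn);
}
```

A caller would then just do demo_advance_and_notify(DEMO_WAIT_FLUSH, lsn)
after updating its local state, and an ordering fix would touch only this one
function.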

And why is the code checking if waitLSNState is non-NULL?

Greetings,

Andres Freund