On Wed, Apr 8, 2026 at 7:59 AM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> > > Patch 0001 looks OK for me.
> > > Regarding patch 0002. Changes made for GetCurrentLSNForWaitType()
> > > looks reliable for me. PerformWalRecovery() sets replayed positions
> > > before starting recovery, and in turn before standby can accept
> > > connections. So, changes to WalReceiverMain() don't look necessary to
> > > me.
> >
> > Yeah, GetCurrentLSNForWaitType seems to be the right place to place
> > the fix. Please see the attached patch 2.
> >
> > I also noticed another relevant problem:
> >
> > During pure archive recovery (no walreceiver), a backend that issues
> > WAIT FOR LSN ... MODE 'standby_write' with a target ahead of the
> > current replay position will sleep forever; the startup process
> > replays past the target but only wakes 'STANDBY_REPLAY' waiters.
> >
> > This also affects mixed scenarios: the walreceiver may lag behind
> > replay (e.g., archive restore has delivered WAL faster than
> > streaming), so a 'standby_write' waiter could be waiting on WAL that
> > replay has already consumed.
> >
> > I will write a patch to address this soon.
> >
>
> Here is the patch.
I've assembled all the pending patches together.
0001 adds a memory barrier to GetWalRcvWriteRecPtr(), as suggested by
Andres off-list.
0002 is basically [1] by Xuneng, revised to account for the memory
barrier in 0001 and to apply my proposal to call ResetLatch()
unconditionally, as in our other Latch-based loops.
0003 and 0004 are [2] by Xuneng.
0005 is [3] by Xuneng.
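For anyone unfamiliar with the convention mentioned for 0002, here is a
rough standalone sketch of the reset-check-wait ordering. It is not
taken from the patches. ToyLatch, the toy_* functions, replayed_lsn,
wait_for_lsn() and advancer() are all hypothetical stand-ins built on
pthreads, not PostgreSQL's Latch API.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy latch; PostgreSQL's real Latch works differently. */
typedef struct ToyLatch
{
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    bool            is_set;
} ToyLatch;

static ToyLatch latch = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false
};

/* Stand-in for a shared position that another process advances. */
static atomic_int replayed_lsn = 0;

static void
toy_set_latch(ToyLatch *l)
{
    pthread_mutex_lock(&l->mutex);
    l->is_set = true;
    pthread_cond_signal(&l->cond);
    pthread_mutex_unlock(&l->mutex);
}

static void
toy_reset_latch(ToyLatch *l)
{
    pthread_mutex_lock(&l->mutex);
    l->is_set = false;
    pthread_mutex_unlock(&l->mutex);
}

static void
toy_wait_latch(ToyLatch *l)
{
    pthread_mutex_lock(&l->mutex);
    while (!l->is_set)
        pthread_cond_wait(&l->cond, &l->mutex);
    pthread_mutex_unlock(&l->mutex);
}

/* Waiter: reset first, then check, then sleep. */
static int
wait_for_lsn(int target)
{
    for (;;)
    {
        int cur;

        /* Reset unconditionally, before looking at the condition. */
        toy_reset_latch(&latch);

        cur = atomic_load(&replayed_lsn);
        if (cur >= target)
            return cur;

        /* A set that arrived after the reset makes this return at once. */
        toy_wait_latch(&latch);
    }
}

/* Waker: publish the new position first, then set the latch. */
static void *
advancer(void *arg)
{
    (void) arg;
    for (int i = 1; i <= 5; i++)
    {
        atomic_store(&replayed_lsn, i);
        toy_set_latch(&latch);
    }
    return NULL;
}
```

Running advancer() in one thread and calling wait_for_lsn(5) in another
should always terminate. A wakeup that arrives between the reset and
the wait leaves the latch set, and one that arrived before the reset
has already published the position that the following check reads.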
I'm going to add them to the Commitfest to run CI over them, and I'll
take a closer look tomorrow.
Links:
1. https://www.postgresql.org/message-id/CABPTF7Wjk_FbOghyr09Rzu6T2bh-L_KBMqHK%2BzhRXpssU0STyQ%40mail.gmail.com
2. https://www.postgresql.org/message-id/CABPTF7X0iV%3DkGC4gjsTj4NvK_NNEJGM3YTc7Obxs5GOiYoMhEw%40mail.gmail.com
3. https://www.postgresql.org/message-id/CABPTF7UBdEfyxATWntmCfoJrwB6iPrnhkXO7y_Avmqc2bOn27A%40mail.gmail.com
------
Regards,
Alexander Korotkov
Supabase