Re: Implement waiting for wal lsn replay: reloaded - Mailing list pgsql-hackers

From Xuneng Zhou
Subject Re: Implement waiting for wal lsn replay: reloaded
Msg-id CABPTF7WJ35p7uidJJZs7fzxBtbVL_0xSFUdZ2Fe8pXh00e=Mxw@mail.gmail.com
In response to Re: Implement waiting for wal lsn replay: reloaded  (Alexander Korotkov <aekorotkov@gmail.com>)
List pgsql-hackers
On Tue, Apr 21, 2026 at 2:46 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:

> The updated patchset is attached.  It includes improved coverage as
> suggested by Andres upthread.  And documentation that WAIT FOR LSN is
> timeline-blind (per off-list discussion with Xuneng).

I revised test patch 6 so that the new cases check the intended
WAIT FOR behavior more directly, and to avoid cases where a test
could pass for the wrong reason.

The fresh walreceiver restart test now distinguishes what we can
observe from what is only covered indirectly.
'pg_last_wal_receive_lsn()' reports 'flushedUpto', not 'writtenUpto',
so the test now describes that state accurately and covers
'writtenUpto' through the 'standby_write' result. This seems
appropriate to me, since the two positions are seeded in the same
places and under the same conditions, so testing the flush LSN should
also help verify the write LSN.

The fencepost tests were split by the actual frontier being tested.
'standby_replay' uses 'pg_last_wal_replay_lsn()', while
'standby_flush' uses 'pg_last_wal_receive_lsn()'. This avoids treating
a replay-derived LSN as if it were also the exact write/flush
boundary. I left 'standby_write' out of the exact fencepost helper
because its frontier is not SQL-visible once walreceiver is stopped.
The async wakeup case now starts the waiter while replay is still
paused, so it must actually sleep before replay and walreceiver are
allowed to advance.
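
As a rough illustration of why the fencepost cases are split per frontier, here is a toy Python model. It is not code from the patch. It assumes that a wait is satisfied once the target LSN is at or below the frontier for the chosen mode; the 'Frontiers' class, the 'wait_satisfied' helper, and the sample LSN values are hypothetical. Only the 'hi/lo' hexadecimal text form of an LSN is taken from PostgreSQL itself.

```python
from dataclasses import dataclass


def parse_lsn(text: str) -> int:
    """Convert PostgreSQL's 'hi/lo' hex LSN text into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def format_lsn(value: int) -> str:
    """Inverse of parse_lsn."""
    return f"{value >> 32:X}/{value & 0xFFFFFFFF:X}"


@dataclass
class Frontiers:
    """Hypothetical snapshot of a standby's positions (toy model only)."""
    replay: int
    flush: int


def wait_satisfied(frontiers: Frontiers, mode: str, target: int) -> bool:
    """Assumed rule: the wait completes once target <= frontier for that mode."""
    frontier = {
        "standby_replay": frontiers.replay,
        "standby_flush": frontiers.flush,
    }[mode]
    return target <= frontier


# Replay lags behind flush here, so the two fenceposts are different LSNs.
f = Frontiers(replay=parse_lsn("0/3000060"), flush=parse_lsn("0/3000100"))
```

In this model, a target one byte past the replay position is still at or below the flush position, so a 'standby_flush' fencepost derived from the replay LSN would be satisfied immediately and would not exercise the boundary at all.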

The cascading timeline-switch test now checks the 'WAIT FOR ...
NO_THROW' status from background psql stdout. The previous log-marker
pattern could pass after an unexpected return status, including
'timeout', because the following statement would still run. The
'received_tli > 1' check remains, but only as confirmation that the
downstream followed the new timeline; the 'success' status proves the
wait completed as intended.
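
The difference between the two checks can be sketched with a small Python model. This is an illustration only, not the TAP test itself: the captured output layout (one status line followed by a marker from the next statement), the 'wait_done' marker name, and both helper functions are assumptions made for the example.

```python
def passes_marker_check(output: str, marker: str = "wait_done") -> bool:
    """Marker-style check (as modelled here): only looks for the follow-up marker."""
    return marker in output.splitlines()


def passes_status_check(output: str) -> bool:
    """Status-style check: the first output line must be exactly 'success'."""
    lines = output.splitlines()
    return bool(lines) and lines[0].strip() == "success"


# With NO_THROW the wait reports a status instead of raising an error,
# so the statement that emits the marker runs in both cases.
timed_out = "timeout\nwait_done\n"
succeeded = "success\nwait_done\n"
```

Under these assumptions the marker check accepts both outputs, while the status check accepts only the one that reports 'success'.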

Please check it.

--
Best,
Xuneng

Attachment