Thread: pgsql: Implement pg_wal_replay_wait() stored procedure

pgsql: Implement pg_wal_replay_wait() stored procedure

From
Alexander Korotkov
Date:
Implement pg_wal_replay_wait() stored procedure

pg_wal_replay_wait() is intended for use on a standby.  It waits for a
specific WAL location to be replayed before the transaction starts.
This is useful when the user makes some data changes on the primary and
needs a guarantee that these changes are visible on the standby.

The queue of waiters is stored in a shared memory array sorted by LSN.
During WAL replay, waiters whose LSNs have already been replayed are removed
from the shared memory array and woken up by setting their latches.
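
The sorted-array scheme described above can be sketched roughly as follows.
This is a minimal illustration, not the code in waitlsn.c: every name here
(DemoWaitQueue, demo_add_waiter, demo_wake_replayed) is hypothetical, a plain
bool stands in for a process latch, and the locking that a real shared memory
structure needs is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names below are hypothetical; none are taken from waitlsn.c. */
typedef uint64_t DemoLSN;

#define DEMO_MAX_WAITERS 8

typedef struct DemoWaiter
{
    DemoLSN     wait_lsn;       /* LSN this waiter wants to see replayed */
    bool       *latch;          /* stand-in for a process latch */
} DemoWaiter;

typedef struct DemoWaitQueue
{
    size_t      count;
    DemoWaiter  waiters[DEMO_MAX_WAITERS];  /* kept sorted by wait_lsn */
} DemoWaitQueue;

/* Insert a waiter while keeping the array sorted.  Returns false if full. */
bool
demo_add_waiter(DemoWaitQueue *q, DemoLSN lsn, bool *latch)
{
    size_t      pos;

    assert(q != NULL && latch != NULL);
    if (q->count >= DEMO_MAX_WAITERS)
        return false;

    /* Find the first entry whose LSN is greater than the new one. */
    for (pos = 0; pos < q->count; pos++)
    {
        if (q->waiters[pos].wait_lsn > lsn)
            break;
    }

    /* Shift the tail right by one slot to make room. */
    memmove(&q->waiters[pos + 1], &q->waiters[pos],
            (q->count - pos) * sizeof(DemoWaiter));
    q->waiters[pos].wait_lsn = lsn;
    q->waiters[pos].latch = latch;
    q->count++;
    return true;
}

/*
 * Wake and remove every waiter whose LSN is at or below replayed_lsn.
 * Because the array is sorted, those waiters form a prefix of the array.
 * Returns the number of waiters woken.
 */
size_t
demo_wake_replayed(DemoWaitQueue *q, DemoLSN replayed_lsn)
{
    size_t      n = 0;

    assert(q != NULL);
    while (n < q->count && q->waiters[n].wait_lsn <= replayed_lsn)
    {
        *q->waiters[n].latch = true;    /* "set the latch" */
        n++;
    }

    /* Drop the woken prefix by shifting the remainder to the front. */
    memmove(&q->waiters[0], &q->waiters[n],
            (q->count - n) * sizeof(DemoWaiter));
    q->count -= n;
    return n;
}
```

Keeping the array sorted means the replay side only has to scan from the
front and can stop at the first waiter whose LSN has not been reached yet.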

pg_wal_replay_wait() needs to wait without any snapshot held.  Otherwise,
the snapshot could prevent the replay of WAL records implying a kind of
self-deadlock.  This is why it is only possible to implement
pg_wal_replay_wait() as a procedure working in a non-atomic context,
not a function.

Catversion is bumped.

Discussion: https://postgr.es/m/eb12f9b03851bb2583adab5df9579b4b%40postgrespro.ru
Author: Kartyshov Ivan, Alexander Korotkov
Reviewed-by: Michael Paquier, Peter Eisentraut, Dilip Kumar, Amit Kapila
Reviewed-by: Alexander Lakhin, Bharath Rupireddy, Euler Taveira

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/06c418e163e913966e17cb2d3fb1c5f8a8d58308

Modified Files
--------------
doc/src/sgml/func.sgml                          | 113 ++++++++
src/backend/access/transam/xlog.c               |   7 +
src/backend/access/transam/xlogrecovery.c       |  11 +
src/backend/catalog/system_functions.sql        |   3 +
src/backend/commands/Makefile                   |   3 +-
src/backend/commands/meson.build                |   1 +
src/backend/commands/waitlsn.c                  | 348 ++++++++++++++++++++++++
src/backend/storage/ipc/ipci.c                  |   7 +
src/backend/storage/lmgr/proc.c                 |   6 +
src/backend/utils/activity/wait_event_names.txt |   1 +
src/include/catalog/catversion.h                |   2 +-
src/include/catalog/pg_proc.dat                 |   5 +
src/include/commands/waitlsn.h                  |  43 +++
src/test/recovery/meson.build                   |   1 +
src/test/recovery/t/043_wal_replay_wait.pl      |  97 +++++++
src/tools/pgindent/typedefs.list                |   2 +
16 files changed, 648 insertions(+), 2 deletions(-)


Re: pgsql: Implement pg_wal_replay_wait() stored procedure

From
Alexander Korotkov
Date:
On Tue, Apr 2, 2024 at 10:58 PM Alexander Korotkov
<akorotkov@postgresql.org> wrote:
> Implement pg_wal_replay_wait() stored procedure

I'm trying to figure out if this failure could be related to this commit...
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2024-04-02%2020%3A24%3A55

------
Regards,
Alexander Korotkov



Re: pgsql: Implement pg_wal_replay_wait() stored procedure

From
Thomas Munro
Date:
On Wed, Apr 3, 2024 at 9:42 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> On Tue, Apr 2, 2024 at 10:58 PM Alexander Korotkov
> <akorotkov@postgresql.org> wrote:
> > Implement pg_wal_replay_wait() stored procedure
>
> I'm trying to figure out if this failure could be related to this commit...
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2024-04-02%2020%3A24%3A55

I think you might need to move WaitLSNShmemInit() down to
CreateOrAttachShmemStructs(), otherwise a -DEXEC_BACKEND build dies with
a NULL pointer dereference.  (Huh, why doesn't that animal show a backtrace?)



Re: pgsql: Implement pg_wal_replay_wait() stored procedure

From
David Rowley
Date:
On Wed, 3 Apr 2024 at 09:42, Alexander Korotkov <aekorotkov@gmail.com> wrote:
> I'm trying to figure out if this failure could be related to this commit...
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2024-04-02%2020%3A24%3A55

Yeah, I think it is.

The problem is that in WaitLSNSetLatches() waitLSN is NULL, so
SpinLockAcquire(&waitLSN->mutex); segfaults.  This shared memory
segment is initialized in WaitLSNShmemInit() called via
CreateSharedMemoryAndSemaphores().  If you look at main() in main.c,
EXEC_BACKEND builds call SubPostmasterMain(), which has no call to
CreateSharedMemoryAndSemaphores(), so waitLSN isn't initialized.
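
A rough sketch of the failure mode described above, using only hypothetical
names (DemoWaitState, demo_state, demo_shmem_init and so on; none are the
real PostgreSQL symbols).  It assumes a per-process global pointer into
shared memory that is set by an init routine.  A forked child inherits the
pointer, but a freshly exec'ed process starts with the pointer at NULL and
has to run the attach step itself.  The exec is only simulated here by
resetting the pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a struct living in shared memory. */
typedef struct DemoWaitState
{
    int         nwaiters;
} DemoWaitState;

/* Stand-in for the shared memory segment itself. */
static DemoWaitState demo_segment;

/* Per-process pointer into the segment; NULL until this process attaches. */
static DemoWaitState *demo_state = NULL;

/* Create-or-attach step: point this process at the shared struct. */
void
demo_shmem_init(void)
{
    demo_state = &demo_segment;
}

/*
 * Simulate what an exec'ed child sees: process-local globals start from
 * their initial values, so the pointer is NULL again.
 */
void
demo_simulate_exec(void)
{
    demo_state = NULL;
}

/*
 * Code path that every process runs at startup in both models.  Calling
 * the init routine from here is what keeps the exec'ed case working.
 */
void
demo_attach_structs(void)
{
    demo_shmem_init();
}

/*
 * Stand-in for a routine that touches the shared struct.  The real code
 * would dereference the pointer unconditionally and crash when it is NULL;
 * this sketch reports the problem instead.
 */
bool
demo_touch_state(void)
{
    if (demo_state == NULL)
        return false;
    demo_state->nwaiters = 0;
    return true;
}
```

In this sketch, demo_touch_state() fails after demo_simulate_exec() until
demo_attach_structs() has been called, which mirrors the idea of putting the
init call in the routine that exec'ed backends also go through.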

David



Re: pgsql: Implement pg_wal_replay_wait() stored procedure

From
Masahiko Sawada
Date:
Hi,

On Wed, Apr 3, 2024 at 4:58 AM Alexander Korotkov
<akorotkov@postgresql.org> wrote:
>
> Implement pg_wal_replay_wait() stored procedure
>
> pg_wal_replay_wait() is intended for use on a standby.  It waits for a
> specific WAL location to be replayed before the transaction starts.
> This is useful when the user makes some data changes on the primary and
> needs a guarantee that these changes are visible on the standby.
>
> The queue of waiters is stored in a shared memory array sorted by LSN.
> During WAL replay, waiters whose LSNs have already been replayed are removed
> from the shared memory array and woken up by setting their latches.
>
> pg_wal_replay_wait() needs to wait without any snapshot held.  Otherwise,
> the snapshot could prevent the replay of WAL records implying a kind of
> self-deadlock.  This is why it is only possible to implement
> pg_wal_replay_wait() as a procedure working in a non-atomic context,
> not a function.
>
> Catversion is bumped.
>

mamba complains about a build error [1]:

waitlsn.c: In function 'WaitForLSN':
waitlsn.c:275:24: error: 'endtime' may be used uninitialized in this function [-Werror=maybe-uninitialized]
  275 |    delay_ms = (endtime - GetCurrentTimestamp()) / 1000;
      |               ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
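
For illustration, this kind of warning typically appears when a variable is
assigned under one condition and read later under the same condition, and the
compiler cannot prove the two checks match.  The sketch below assumes that
shape; it is not the code from waitlsn.c and not necessarily the fix that was
applied.  The function demo_remaining_ms and its parameters are hypothetical,
and the current time is passed in as an argument to keep the example
deterministic.  Initializing the variable at its declaration is one common
way to silence the warning.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical timestamp type, in microseconds. */
typedef int64_t DemoTimestamp;

/*
 * Return the remaining wait time in milliseconds, or -1 for "no timeout".
 * timeout_ms <= 0 is treated as "wait forever" in this sketch.
 */
long
demo_remaining_ms(int64_t timeout_ms, DemoTimestamp start_us,
                  DemoTimestamp now_us)
{
    /*
     * Initialized at declaration so that every path sees a defined value,
     * even though it is only read when timeout_ms > 0.
     */
    DemoTimestamp endtime = 0;
    long        delay_ms = -1;

    assert(now_us >= start_us);

    if (timeout_ms > 0)
        endtime = start_us + timeout_ms * 1000;

    /* ... other work would happen here in a real wait loop ... */

    if (timeout_ms > 0)
    {
        delay_ms = (long) ((endtime - now_us) / 1000);
        if (delay_ms < 0)
            delay_ms = 0;       /* deadline already passed */
    }
    return delay_ms;
}
```

Without the `= 0` initializer, some compilers and optimization levels flag
the read of endtime because they analyze the two `if` statements separately.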

Regards,

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2024-04-03%2005%3A33%3A24

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com