Re: [HACKERS] make async slave to wait for lsn to be replayed - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: [HACKERS] make async slave to wait for lsn to be replayed
Date
Msg-id CAPpHfdsZ8hJiYaRxTbnKqVee_Nmbb+PVd+QtWifoBYkX71f0dA@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] make async slave to wait for lsn to be replayed  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [HACKERS] make async slave to wait for lsn to be replayed
Re: [HACKERS] make async slave to wait for lsn to be replayed
List pgsql-hackers
Hi!

I've decided to put my hands on this patch.

On Thu, Mar 7, 2024 at 2:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> +1 for the second one not only because it avoids new words in grammar
> but also sounds to convey the meaning. I think you can explain in docs
> how this feature can be used basically how will one get the correct
> LSN value to specify.

I picked the second option and left only the AFTER clause for the
BEGIN statement.  I think this should be enough for the beginning.

> As suggested previously also pick one of the approaches (I would
> advocate the second one) and keep an option for the second one by
> mentioning it in the commit message. I hope to see more
> reviews/discussions or usage like how will users get the LSN value to
> be specified on the core logic of the feature at this stage. IF
> possible, state, how real-world applications could leverage this
> feature.

I've added a paragraph to the docs about the usage.  After you made
some changes on primary, you run pg_current_wal_insert_lsn().  Then
connect to replica and run 'BEGIN AFTER lsn' with the just obtained
LSN.  Now you're guaranteed to see the changes made to the primary.

Also, I've significantly reworked other aspects of the patch.  The
most significant changes are:
1) Waiters are now stored in the array sorted by LSN.  This saves us
from scanning of wholeper-backend array.
2) Waiters are removed from the array immediately once their LSNs are
replayed.  Otherwise, the WAL replayer will keep scanning the shared
memory array till waiters wake up.
3) To clean up after errors, we now call WaitLSNCleanup() on backend
shmem exit.  I think this is preferable over the previous approach to
remove from the queue before ProcessInterrupts().
4) There is now condition to recheck if LSN is replayed after adding
to the shared memory array.  This should save from the race
conditions.
5) I've renamed too generic names for functions and files.

------
Regards,
Alexander Korotkov

Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation
Next
From: shveta malik
Date:
Subject: Re: Regardign RecentFlushPtr in WalSndWaitForWal()