Re: Improve pg_sync_replication_slots() to wait for primary to advance - Mailing list pgsql-hackers

From Ajin Cherian
Subject Re: Improve pg_sync_replication_slots() to wait for primary to advance
Date
Msg-id CAFPTHDY19pyri1b9cectWH6NNRjWOcPa8YczqajbwXzJtGi_oQ@mail.gmail.com
Whole thread Raw
In response to Re: Improve pg_sync_replication_slots() to wait for primary to advance  (shveta malik <shveta.malik@gmail.com>)
Responses Re: Improve pg_sync_replication_slots() to wait for primary to advance
Re: Improve pg_sync_replication_slots() to wait for primary to advance
List pgsql-hackers
On Tue, Dec 2, 2025 at 8:35 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Dec 2, 2025 at 1:58 PM Ajin Cherian <itsajin@gmail.com> wrote:
> >
> >
> > Attached patch v27 addresses the above comments.
> >
>
> Thanks for the patch. Please find a few comments:
>
> 1)
> + /* The worker pid must not be already assigned in SlotSyncCtx */
> + Assert(SlotSyncCtx->pid == InvalidPid);
> +
>
> We can mention just 'pid' here instead of 'worker pid'
>

Changed.

> 2)
> + /*
> + * The syscache access in fetch_or_refresh_remote_slots() needs a
> + * transaction env.
> + */
> fetch_or_refresh_remote_slots --> fetch_remote_slots
>
> 3)
>  SyncReplicationSlots(WalReceiverConn *wrconn)
>  {
> +
>
> We can get rid of this blank line at the start of the function.
>

Fixed.

> 4)
>  /*
>   * Shut down the slot sync worker.
>   *
> - * This function sends signal to shutdown slot sync worker, if required. It
> + * This function sends signal to shutdown the slot sync process, if
> required. It
>   * also waits till the slot sync worker has exited or
>   * pg_sync_replication_slots() has finished.
>   */
> Shall we change comment to something like (rephrase if required):
>
> Shut down the slot synchronization.
> This function wakes up the slot sync process (either worker or backend
> running SQL function) and sets stopSignaled=true
> so that worker can exit or SQL function pg_sync_replication_slots()
> can finish. It also waits till the slot sync worker has exited or
> pg_sync_replication_slots() has finished.
>

Changed.

> 5)
> We should change the comment atop 'SlotSyncCtxStruct' as well to
> mention that this pid is either the slot sync worker's pid or
> backend's pid running the SQL function. It is needed by the startup
> process to wake these up, so that they can stop synchronization on
> seeing stopSignaled.  <please rephrase as needed>
>

Changed.

> 6)
> + ereport(LOG,
> + errmsg("replication slot synchronization worker is shutting down on
> receiving SIGUSR1"));
>
> SIGUSR1 was actually just a wake-up signal. We may change the comment to:
> replication slot synchronization worker is shutting down as promotion
> is triggered.
>

Changed.

> 7)
> update_synced_slots_inactive_since:
> /* The slot sync worker or SQL function mustn't be running by now */
> - Assert((SlotSyncCtx->pid == InvalidPid) && !SlotSyncCtx->syncing);
> + Assert(!SlotSyncCtx->syncing);
>
> Regarding this, I see that 'update_synced_slots_inactive_since' is
> only called when we are sure that 'syncing' is false. So shouldn't pid
> be also Invalid by that time? Even if it was backend's pid to start
> with, but since backend has stopped syncing (finished or error-ed
> out),
> pid should be reset to Invalid in such a case. And this Assert need
> not to be changed.
>

Fixed.

> 8)
>
> + if (sync_process_pid != InvalidPid)
> + kill(sync_process_pid, SIGUSR1);
>
> We can write comments to say wake-up slot sync process.
>

Added comments.

Attaching patch v28 addressing these comments.

regards,
Ajin Cherian
Fujitsu Australia

Attachment

pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Returning nbtree posting list TIDs in DESC order during backwards scans
Next
From: shveta malik
Date:
Subject: Re: Skipping schema changes in publication