Re: promotion related handling in pg_sync_replication_slots() - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: promotion related handling in pg_sync_replication_slots()
Date
Msg-id CAA4eK1LbZ4bgDFtG5HM=ye6mrHR4WTiAv1V7dY4PPod4F++5dA@mail.gmail.com
Whole thread Raw
In response to Re: promotion related handling in pg_sync_replication_slots()  (shveta malik <shveta.malik@gmail.com>)
Responses Re: promotion related handling in pg_sync_replication_slots()
List pgsql-hackers
On Fri, Apr 12, 2024 at 7:57 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Sat, Apr 6, 2024 at 12:25 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > On Fri, Apr 5, 2024 at 10:31 AM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > Please find v2. Changes are:
> >
> > Thanks for the patch. Here are some comments.
>
> Thanks for reviewing.
> >
> > 1. Can we have a clear saying in the shmem variable who's syncing at
> > the moment? Is it a slot sync worker or a backend via SQL function?
> > Perhaps turn "bool syncing;" to "SlotSyncSource sync_source;"
> >
> > typedef enum SlotSyncSource
> > {
> >     SLOT_SYNC_NONE,
> >     SLOT_SYNC_WORKER,
> >     SLOT_SYNC_BACKEND,
> > } SlotSyncSource;
> >
> > Then, the check in ShutDownSlotSync can be:
> >
> > +       /*
> > +        * Return if neither the slot sync worker is running nor the function
> > +        * pg_sync_replication_slots() is executing.
> > +        */
> > +       if ((SlotSyncCtx->pid == InvalidPid) &&
> > SlotSyncCtx->sync_source != SLOT_SYNC_BACKEND)
> >         {
> >

I don't know if this will be help, especially after fixing the race
condition I mentioned. But otherwise, also, at this stage it doesn't
seem helpful to add the source of sync explicitly.

> > 2.
> > SyncReplicationSlots(WalReceiverConn *wrconn)
> >  {
> > +    /*
> > +     * Startup process signaled the slot sync to stop, so if meanwhile user
> > +     * has invoked slot sync SQL function, simply return.
> > +     */
> > +    SpinLockAcquire(&SlotSyncCtx->mutex);
> > +    if (SlotSyncCtx->stopSignaled)
> > +    {
> > +        ereport(LOG,
> > +                errmsg("skipping slot synchronization as slot sync
> > shutdown is signaled during promotion"));
> > +
> >
> > Unless I'm missing something, I think this can't detect if the backend
> > via SQL function is already half-way through syncing in
> > synchronize_one_slot. So, better move this check to (or also have it
> > there) slot sync loop that calls synchronize_one_slot. To avoid
> > spinlock acquisitions, we can perhaps do this check in when we acquire
> > the spinlock for synced flag.
>
> If the sync via SQL function is already half-way, then promotion
> should wait for it to finish. I don't think it is a good idea to move
> the check to synchronize_one_slot().  The sync-call should either not
> start (if it noticed the promotion) or finish the sync and then let
> promotion proceed. But I would like to know others' opinion on this.
>

Agreed.

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: promotion related handling in pg_sync_replication_slots()
Next
From: Etsuro Fujita
Date:
Subject: Re: Another WaitEventSet resource leakage in back branches