Re: Improve pg_sync_replication_slots() to wait for primary to advance - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Improve pg_sync_replication_slots() to wait for primary to advance
Date
Msg-id CAFiTN-t9+yXo-HHN_mJLsu8A-OOORAmJpswyV9wKBzL9QiLqew@mail.gmail.com
Whole thread Raw
In response to Re: Improve pg_sync_replication_slots() to wait for primary to advance  (shveta malik <shveta.malik@gmail.com>)
Responses Re: Improve pg_sync_replication_slots() to wait for primary to advance
List pgsql-hackers
On Fri, Jul 18, 2025 at 10:45 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Fri, Jul 18, 2025 at 10:14 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Thu, Jul 17, 2025 at 9:34 AM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Wed, Jul 16, 2025 at 3:47 PM Ajin Cherian <itsajin@gmail.com> wrote:
> > > >
> > > > > I am not able to apply the patch to the latest head or even to a week
> > > > > back version. Can you please check and rebase?
> > > > >
> > > > > thanks
> > > > > Shveta
> > > >
> > > > Rebased.
> > > >
> > >
> > > Thanks. Please find a few comments:
> > >
> > >
> > > 1)
> > > /* Any slot with NULL in these fields should not have made it this far */
> > >
> > > It is good to get rid of the case where we had checks for NULL
> > > confirmed_lsn and catalog_xmin (i.e. when slot was in RS_EPHEMERAL
> > > state), as that has already been checked by synchronize_slots() and
> > > such a slot will not even reach wait_for_primary_slot_catchup(). But a
> > > slot can still be invalidated on primary anytime, and thus during this
> > > wait, we should check for primary's invalidation as we were doing in
> > > v1.
> > >
> > > 2)
> > > + * If in SQL API synchronization, and we've been  promoted, then no point
> > >
> > >  extra space before promoted.
> > >
> > > 3)
> > >
> > > + if (!AmLogicalSlotSyncWorkerProcess() && PromoteIsTriggered())
> > >
> > > We don't need 'AmLogicalSlotSyncWorkerProcess' as that is already
> > > checked at the beginning of this function.
> > >
> > > 4)
> > > + ereport(WARNING,
> > > + errmsg("aborting sync for slot \"%s\"",
> > > + remote_slot->name),
> > > + errdetail("Promotion occurred before this slot was fully"
> > > +   " synchronized."));
> > > + pfree(cmd.data);
> > > +
> > > + return false;
> > >
> > > a) Please add an error-code.
> > >
> > > b) Shall we change msg to
> > >
> > > errmsg("aborting sync for slot \"%s\"",
> > > remote_slot->name),
> > >                     errhint("%s cannot be executed once promotion is
> > > triggered.",
> > >
> > > "pg_sync_replication_slots()")));
> > >
> > >
> > >
> > > 5)
> > > Instead of using PromoteIsTriggered, shall we rely on
> > > 'SlotSyncCtx->stopSignaled' as we do when we start this API.
> > >
> > > 6)
> > > In logicaldecoding.sgml, we can get rid of "Additionally, enabling
> > > sync_replication_slots on the standby is required" to make it same as
> > > what we had prior to the patch I pointed earlier.
> > >
> > > Or better we can refine it to below. Thoughts?
> > >
> > > The logical replication slots on the primary can be enabled for
> > > synchronization to the hot standby by using the failover parameter of
> > > pg_create_logical_replication_slot, or by using the failover option of
> > > CREATE SUBSCRIPTION during slot creation. After that, synchronization
> > > can be performed either manually by calling pg_sync_replication_slots
> > > on the standby, or automatically by enabling sync_replication_slots on
> > > the standby. When sync_replication_slots is enabled, the failover
> > > slots are periodically synchronized by the slot sync worker. For the
> > > synchronization to work, .....
> >
> > I am wondering if we should provide an optional parameter to
> > pg_sync_replication_slots(), to control whether to wait for the slot
> > to be synced or just return with ERROR as it is doing now, default can
> > be wait.
> >
>
> Do you mean specifically in case of promotion or in general, as in do
> not wait for primary to catch-up (anytime) and exit and drop the
> temporary slot while exiting?

I am specifically pointing to the exposed function
pg_sync_replication_slots() which was earlier non-blocking and was
giving an error if the primary slot for not catch-up, so we have
improved the functionality in this thread by making it wait for
primary to catch-up instead of just throwing an error.  But my
question was since this is a user facing function so shall we keep the
old behavior intact with some optional parameters so that if the user
chooses not to wait they have options to do that?

--
Regards,
Dilip Kumar
Google



pgsql-hackers by date:

Previous
From: shveta malik
Date:
Subject: Re: Improve pg_sync_replication_slots() to wait for primary to advance
Next
From: shveta malik
Date:
Subject: Re: Improve pg_sync_replication_slots() to wait for primary to advance