Re: Improve pg_sync_replication_slots() to wait for primary to advance - Mailing list pgsql-hackers
From | Dilip Kumar |
---|---|
Subject | Re: Improve pg_sync_replication_slots() to wait for primary to advance |
Date | |
Msg-id | CAFiTN-t9+yXo-HHN_mJLsu8A-OOORAmJpswyV9wKBzL9QiLqew@mail.gmail.com Whole thread Raw |
In response to | Re: Improve pg_sync_replication_slots() to wait for primary to advance (shveta malik <shveta.malik@gmail.com>) |
Responses |
Re: Improve pg_sync_replication_slots() to wait for primary to advance
|
List | pgsql-hackers |
On Fri, Jul 18, 2025 at 10:45 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Fri, Jul 18, 2025 at 10:14 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Thu, Jul 17, 2025 at 9:34 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > On Wed, Jul 16, 2025 at 3:47 PM Ajin Cherian <itsajin@gmail.com> wrote: > > > > > > > > > I am not able to apply the patch to the latest head or even to a week > > > > > back version. Can you please check and rebase? > > > > > > > > > > thanks > > > > > Shveta > > > > > > > > Rebased. > > > > > > > > > > Thanks. Please find a few comments: > > > > > > > > > 1) > > > /* Any slot with NULL in these fields should not have made it this far */ > > > > > > It is good to get rid of the case where we had checks for NULL > > > confirmed_lsn and catalog_xmin (i.e. when slot was in RS_EPHEMERAL > > > state), as that has already been checked by synchronize_slots() and > > > such a slot will not even reach wait_for_primary_slot_catchup(). But a > > > slot can still be invalidated on primary anytime, and thus during this > > > wait, we should check for primary's invalidation as we were doing in > > > v1. > > > > > > 2) > > > + * If in SQL API synchronization, and we've been promoted, then no point > > > > > > extra space before promoted. > > > > > > 3) > > > > > > + if (!AmLogicalSlotSyncWorkerProcess() && PromoteIsTriggered()) > > > > > > We don't need 'AmLogicalSlotSyncWorkerProcess' as that is already > > > checked at the beginning of this function. > > > > > > 4) > > > + ereport(WARNING, > > > + errmsg("aborting sync for slot \"%s\"", > > > + remote_slot->name), > > > + errdetail("Promotion occurred before this slot was fully" > > > + " synchronized.")); > > > + pfree(cmd.data); > > > + > > > + return false; > > > > > > a) Please add an error-code. > > > > > > b) Shall we change msg to > > > > > > errmsg("aborting sync for slot \"%s\"", > > > remote_slot->name), > > > errhint("%s cannot be executed once promotion is > > > triggered.", > > > > > > "pg_sync_replication_slots()"))); > > > > > > > > > > > > 5) > > > Instead of using PromoteIsTriggered, shall we rely on > > > 'SlotSyncCtx->stopSignaled' as we do when we start this API. > > > > > > 6) > > > In logicaldecoding.sgml, we can get rid of "Additionally, enabling > > > sync_replication_slots on the standby is required" to make it same as > > > what we had prior to the patch I pointed earlier. > > > > > > Or better we can refine it to below. Thoughts? > > > > > > The logical replication slots on the primary can be enabled for > > > synchronization to the hot standby by using the failover parameter of > > > pg_create_logical_replication_slot, or by using the failover option of > > > CREATE SUBSCRIPTION during slot creation. After that, synchronization > > > can be performed either manually by calling pg_sync_replication_slots > > > on the standby, or automatically by enabling sync_replication_slots on > > > the standby. When sync_replication_slots is enabled, the failover > > > slots are periodically synchronized by the slot sync worker. For the > > > synchronization to work, ..... > > > > I am wondering if we should provide an optional parameter to > > pg_sync_replication_slots(), to control whether to wait for the slot > > to be synced or just return with ERROR as it is doing now, default can > > be wait. > > > > Do you mean specifically in case of promotion or in general, as in do > not wait for primary to catch-up (anytime) and exit and drop the > temporary slot while exiting? I am specifically pointing to the exposed function pg_sync_replication_slots() which was earlier non-blocking and was giving an error if the primary slot for not catch-up, so we have improved the functionality in this thread by making it wait for primary to catch-up instead of just throwing an error. But my question was since this is a user facing function so shall we keep the old behavior intact with some optional parameters so that if the user chooses not to wait they have options to do that? -- Regards, Dilip Kumar Google
pgsql-hackers by date: