Re: Issue with logical replication slot during switchover - Mailing list pgsql-hackers

From shveta malik
Subject Re: Issue with logical replication slot during switchover
Date
Msg-id CAJpy0uCZ-Z9Km-fGjXm9C90DoiF_EFe2SbCh9Aw7vnF-9K+J_A@mail.gmail.com
In response to Re: Issue with logical replication slot during switchover  (Fabrice Chapuis <fabrice636861@gmail.com>)
Responses Re: Issue with logical replication slot during switchover
List pgsql-hackers
On Fri, Aug 8, 2025 at 7:01 PM Fabrice Chapuis <fabrice636861@gmail.com> wrote:
>
> Thanks Shveta for coming on this point again and fixing the link.
> The idea is to check if the slot has same name to try to resynchronize it with the primary.
> ok the check on the failover status for the remote slot is perhaps redundant.
> I'm not sure what impact setting the synced flag to true might have. But if you run an additional switchover, it
> works fine because the synced flag on the new primary is set to true now.
> If we come back to the idea of the GUC or the API, adding an allow_overwrite parameter to the
> pg_create_logical_replication_slot function and removing the logical slot when set to true could be a suitable approach.
>
> What is your opinion?
>

A GUC would address only one specific corner case, which makes it a
poor fit for a server-wide setting.

OTOH, adding it as a slot property makes more sense. You could start
by introducing a new slot property, allow_overwrite. By default,
this property will be set to false.

a) The function pg_create_logical_replication_slot() can be extended
to accept this parameter.
b) A new API pg_alter_logical_replication_slot() can be introduced, to
modify this property after slot creation if needed.
c) The commands CREATE SUBSCRIPTION and ALTER SUBSCRIPTION do not
need to include an allow_overwrite parameter.  When CREATE
SUBSCRIPTION creates a slot, it will always set allow_overwrite to
false by default. If users need to change this later, they can use the
new API pg_alter_logical_replication_slot() to update the property.
d) Additionally, pg_alter_logical_replication_slot() can serve as a
generic API to modify other slot properties as well.

This appears to be a reasonable idea with potential use cases beyond
just allowing synchronization post switchover. Thoughts?
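
To make points (a) to (d) concrete, here is a rough model of the
proposed semantics. It is plain Python, not PostgreSQL code, and every
name in it (Slot, create_logical_slot, alter_logical_slot,
create_subscription_slot, and the set of alterable properties) is
hypothetical and only for illustration.

```python
from dataclasses import dataclass

# Assumption: which properties the alter API may change is not decided
# in this thread; this set is only an illustration.
ALTERABLE_PROPERTIES = ("failover", "allow_overwrite")


@dataclass
class Slot:
    name: str
    failover: bool = False
    synced: bool = False
    allow_overwrite: bool = False  # proposed property, false by default


def create_logical_slot(name, failover=False, allow_overwrite=False):
    """Models (a): the create function accepts the new parameter."""
    return Slot(name=name, failover=failover, allow_overwrite=allow_overwrite)


def alter_logical_slot(slot, **props):
    """Models (b) and (d): a generic setter for slot properties."""
    for key in props:
        if key not in ALTERABLE_PROPERTIES:
            raise ValueError(f"cannot alter slot property: {key}")
    for key, value in props.items():
        setattr(slot, key, value)
    return slot


def create_subscription_slot(name):
    """Models (c): CREATE SUBSCRIPTION always leaves allow_overwrite false."""
    return create_logical_slot(name)
```

In this model a slot created through the subscription path starts with
allow_overwrite = false, and the only way to change it afterwards is
the alter call, which matches (c).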

~~~

Another problem, as you pointed out, is inconsistent behaviour across
switchovers. On the first switchover, we get this error on the new standby:
 "Exiting from slot synchronization because a slot with the same name
already exists on the standby."

But in the case of a double switchover, this error does not occur.
This is because the 'synced' flag is not set on the new standby after
the first switchover, whereas it is set after a double switchover. I
think the behaviour should be the same: in both cases, it should emit
the same error. We are thinking of a potential solution here and will
start a new thread if needed.
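
The difference between the two cases can be illustrated with a small
model. This is Python, not the slot sync worker's code; the function
names and the dict-based slot representation are hypothetical, and the
error text is the one quoted above. The only rule modelled is that a
local slot with the same name is rejected when its synced flag is
false and accepted when it is true.

```python
SAME_NAME_ERROR = (
    "Exiting from slot synchronization because a slot with the same name "
    "already exists on the standby."
)


def sync_slot_on_standby(local_slots, remote_name):
    """Simplified model of syncing one remote slot onto a standby."""
    local = local_slots.get(remote_name)
    if local is None:
        # No local slot yet: create it and mark it as synced.
        local_slots[remote_name] = {"synced": True}
        return "created"
    if not local["synced"]:
        # A slot with this name exists locally but was not created by sync.
        raise RuntimeError(SAME_NAME_ERROR)
    return "updated"


def simulate():
    """Replay the initial sync, first switchover and double switchover."""
    node_a = {"s1": {"synced": False}}  # original primary, user-created slot
    node_b = {}                         # original standby
    outcomes = [sync_slot_on_standby(node_b, "s1")]  # initial sync to B

    # First switchover: B becomes primary, A becomes the standby.
    try:
        outcomes.append(sync_slot_on_standby(node_a, "s1"))
    except RuntimeError:
        outcomes.append("error")

    # Second switchover: A is primary again, B is the standby again.
    outcomes.append(sync_slot_on_standby(node_b, "s1"))
    return outcomes
```

Under these assumptions the first switchover fails on node A because
its slot was created by the user and never marked as synced, while the
second switchover passes on node B because its copy of the slot kept
synced = true from the initial sync.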

thanks
Shveta


