Re: failover logical replication slots - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: failover logical replication slots
Date
Msg-id CAA4eK1L8u0wWWYLj3STbiiCAesKNSmPX8tKSs88Zg_ajMVjyxg@mail.gmail.com
In response to Re: failover logical replication slots  (Fabrice Chapuis <fabrice636861@gmail.com>)
List pgsql-hackers
On Fri, Jul 11, 2025 at 8:42 PM Fabrice Chapuis <fabrice636861@gmail.com> wrote:
>
> Hi Amit,
> Here is a proposed solution to handle the problem of creating the logical replication slot on standby after a switchover.
> Thank you for your comments and help on this issue
>
> Regards
>
> Fabrice
>
> diff --git a/src/backend/replication/logical/slotsync.c b/src/backend/replication/logical/slotsync.c
> index 656e66e..296840a 100644
> --- a/src/backend/replication/logical/slotsync.c
> +++ b/src/backend/replication/logical/slotsync.c
> @@ -627,6 +627,7 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
>         ReplicationSlot *slot;
>         XLogRecPtr      latestFlushPtr;
>         bool            slot_updated = false;
> +       bool            overwriting_failover_slot = true; /* could be a GUC */
>
>         /*
>          * Make sure that concerned WAL is received and flushed before syncing
> @@ -654,19 +655,37 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid)
>         if ((slot = SearchNamedReplicationSlot(remote_slot->name, true)))
>         {
>                 bool            synced;
> +               bool            failover_status = remote_slot->failover;;
>
>                 SpinLockAcquire(&slot->mutex);
>                 synced = slot->data.synced;
>                 SpinLockRelease(&slot->mutex);
>
> -               /* User-created slot with the same name exists, raise ERROR. */
> -               if (!synced)
> -                       ereport(ERROR,
> +               if (!synced){
> +                       /*
> +                        * Check if we need to overwrite an existing failover slot and
> +                        * if slot has the failover flag set to true
> +                        * and the sync_replication_slots is on,
> +                        * other check could be added here */
> +                       if (overwriting_failover_slot && failover_status && sync_replication_slots){
> +

I think we don't need to explicitly check sync_replication_slots, as
we should reach here only when that flag is set. I think we should
introduce a pg_alter_replication_slot API that allows overwriting
existing slots during sync by setting a parameter like allow_overwrite
(or something like that). This API would be useful for other purposes
too, like changing the two_phase or failover properties of a slot after
its creation. BTW, we also discussed supporting a pg_drop_all_slots
kind of API. See if you are interested in implementing that API as
well.
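
To make the suggestion concrete, here is a rough, standalone sketch of the
decision the sync code could make for a slot name that already exists on the
standby. It is not PostgreSQL source: LocalSlotState, SyncAction and
decide_sync_action are made-up names for illustration, and allow_overwrite is
the hypothetical per-slot property that the proposed API would set. Note that
there is no check of sync_replication_slots, on the assumption that this path
is reached only when slot synchronization is enabled.

```c
#include <stdbool.h>

/* Simplified stand-in for the local slot state; NOT the real ReplicationSlot. */
typedef struct LocalSlotState
{
    bool        exists;             /* a slot with the remote name exists locally */
    bool        synced;             /* slot was created by slot synchronization */
    bool        allow_overwrite;    /* HYPOTHETICAL property set via the proposed API */
} LocalSlotState;

/* Possible outcomes for one remote slot (names are illustrative only). */
typedef enum SyncAction
{
    SYNC_CREATE,                /* no local slot yet: create a synced slot */
    SYNC_UPDATE,                /* local slot is already a synced slot: update it */
    SYNC_OVERWRITE,             /* user-created slot that opted in: overwrite it */
    SYNC_ERROR                  /* user-created slot without opt-in: raise ERROR */
} SyncAction;

/*
 * Decide what to do with a remote slot given the local slot of the same name.
 * The caller is assumed to run only when slot synchronization is enabled, so
 * that setting is not re-checked here.
 */
static SyncAction
decide_sync_action(const LocalSlotState *local)
{
    if (!local->exists)
        return SYNC_CREATE;
    if (local->synced)
        return SYNC_UPDATE;
    if (local->allow_overwrite)
        return SYNC_OVERWRITE;
    return SYNC_ERROR;
}
```

With this shape, the existing ERROR remains the default for user-created
slots, and overwriting happens only for slots that were explicitly marked.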

Note: I suggest starting a new thread with the concrete proposal for
the new API or GUC, stating how it will be helpful. It might help in
getting suggestions from others as well.

--
With Regards,
Amit Kapila.


