Re: Issue with logical replication slot during switchover - Mailing list pgsql-hackers

From Alexander Kukushkin
Subject Re: Issue with logical replication slot during switchover
Date
Msg-id CAFh8B=kH4Nwd_69fWP4VxK9tjxxBUzdxEZLd+LCby9tbSMTcRA@mail.gmail.com
In response to Re: Issue with logical replication slot during switchover  (Fabrice Chapuis <fabrice636861@gmail.com>)
List pgsql-hackers
Hi,

On Fri, 31 Oct 2025 at 09:16, Fabrice Chapuis <fabrice636861@gmail.com> wrote:
Hi, 
I indeed proposed a solution at the top of this thread to modify only the value of the synced attribute, but the discussion was redirected to adding an extra parameter to the function pg_create_logical_replication_slot() to overwrite a failover slot
We had discussed this point in another thread, please see [1]. After
discussion it was decided to not go this way.

[1]: https://www.postgresql.org/message-id/OS0PR01MB57161FF469DE049765DD53A89475A%40OS0PR01MB5716.jpnprd01.prod.outlook.com


I’ve read through the referenced discussion, and my impression is that we might be trying to design a solution around assumptions that are unlikely to hold in practice.
There was an argument that at some point we might allow creating logical failover slots on cascading standbys. However, if we consider all practical scenarios, it seems very unlikely that such a feature could work reliably with the current design.
Let me try to explain why.

Consider the following setup:
node1 - primary  
node2 - standby, replicating from node1  
node3 - standby, replicating from node1, has logical slot foo (failover=true, synced=false)  
node4 - standby, replicating from node3, has logical slot foo (failover=true, synced=true)

1) If node1 fails, we could promote either node2 or node3:
1.a) If we promote node2, we must first create a physical slot for node3, update primary_conninfo on node3 to point to node2, and then wait until node3 connects and catalog_xmin on the physical slot becomes non-NULL. Only then would it be safe to promote node2. This introduces unnecessary steps, complexity, and waiting, all of which increase downtime and defeat the goal of high availability.
1.b) If we promote node3, promotion itself is fast, but subscribers will still be using the slot on the original primary. This again defeats the purpose of doing logical replication from a standby, and it won’t be possible to switch subscribers to node4 (see below).
2) If node3 fails, we might want to replace it with node4. But node4 has a slot with failover=true and synced=true, and synced=true prevents it from being used for streaming because it’s a standby.
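
To make the cost of 1.a concrete, the sequence could look roughly like the sketch below. The slot name 'node3_slot' and the connection string are placeholders I made up for illustration, and the sketch assumes hot_standby_feedback is enabled on node3 so that catalog_xmin is reported upstream at all.

```sql
-- On node2 (still a standby): create a physical slot for node3.
-- 'node3_slot' is a hypothetical name.
SELECT pg_create_physical_replication_slot('node3_slot');

-- On node3: repoint replication to node2.
-- The connection string is a placeholder, not a recommended setting.
ALTER SYSTEM SET primary_conninfo = 'host=node2 port=5432 user=replicator';
ALTER SYSTEM SET primary_slot_name = 'node3_slot';
SELECT pg_reload_conf();

-- On node2: poll until node3 is connected (active = true)
-- and catalog_xmin on the physical slot is non-NULL.
SELECT slot_name, active, catalog_xmin
  FROM pg_replication_slots
 WHERE slot_name = 'node3_slot';

-- On node2: only once catalog_xmin is non-NULL, promote.
SELECT pg_promote();
```

Every step before pg_promote() is extra time during which the cluster has no primary.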

In other words, with the current design, allowing creation of logical failover slots on standbys doesn’t bring any real benefit, because such “synced” slots can’t actually be used later.

One could argue that we could add a function to switch synced=true->false on a standby, but that would just add another workaround on top of an already fragile design, increasing operational complexity without solving the underlying issue.

The same applies to proposals like allow_overwrite. If such a flag is introduced, in practice it will almost always be used unconditionally, e.g.:
SELECT pg_create_logical_replication_slot('<name>', '<plugin>', failover := true, allow_overwrite := true);

Right now, logical failover slots can’t be reused after a switchover, which is a perfectly normal operation.
The only current workaround is to detect standbys with failover=true, synced=false and drop those slots, hoping they’ll be resynchronized. But resynchronization is asynchronous, unpredictable, and may take an unbounded amount of time. If the primary fails during that window, there might be no standby with ready logical slots.
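
For reference, that workaround amounts to something like the following on each standby after a switchover. This is only a sketch of the detect-and-drop approach, not a recommendation, and the extra "NOT active" filter is my own precaution rather than a requirement.

```sql
-- Run on a standby: list logical slots that are marked failover=true
-- but are not synced, i.e. leftovers from when this node was the primary.
SELECT slot_name, active, failover, synced
  FROM pg_replication_slots
 WHERE pg_is_in_recovery()
   AND slot_type = 'logical'
   AND failover
   AND NOT synced;

-- After reviewing the list, drop them so that slot synchronization
-- can recreate them as synced slots.
SELECT pg_drop_replication_slot(slot_name)
  FROM pg_replication_slots
 WHERE pg_is_in_recovery()
   AND slot_type = 'logical'
   AND failover
   AND NOT synced
   AND NOT active;
```

After the drop, the slots only come back once the slot sync worker (or a manual pg_sync_replication_slots() call) has run and the synced slot has reached a usable state, which is exactly the window described above.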

Instead of dropping such slots, what we actually need is a way to safely set synced=false->true and continue operating.

Operating logical replication setups is already extremely complex and error-prone. This is not theoretical; it’s something many of us face daily.
So rather than adding more speculative features or workarounds, I think we should focus on addressing real operational pain points and the inconsistencies in the current design.

A slot created on the primary (which later becomes a standby) with failover=true has a very clear purpose. The failover flag already indicates that purpose; synced shouldn’t override it.

Regards,
--
Alexander Kukushkin
