Thread: Logical Replication slot disappeared after promote Standby
----------+--------------------+-----------+-----------+--------+----------+---------------+----------+--------+-------------+---------------------+------------------------------
stand-by | kafka_logical_slot | logical | t | t | pgoutput | replica_test | t | t | 0/6C000000 | | 2025-06-13 00:43:15.61492+00
----------+--------------------+-----------+-----------+--------+----------+---------------+----------+--------+-------------+---------------------+-------------------------------
stand-by | kafka_logical_slot | logical | f | f | pgoutput | replica_test | t | t | 0/6D000060 | 0/6D000098 | 2025-06-13 00:45:11.547671+00
Thanks Hou zj,

I will capture the log messages and share them with you.

The temporary column is marked 'false' everywhere, but the synced column is marked 'false' on the Primary, whereas it is 'true' on both direct standbys.

#3: No, I didn't drop the slot on the Primary. In fact, Secondary B (the other direct standby) still shows the slot.

On Thu, Jun 12, 2025 at 1:44 AM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote:
On Thu, Jun 12, 2025 at 4:08 PM Perumal Raj wrote:
> Hi Community,
>
> I have installed postgres version 17.5 with following setup,
>
> Primary
> -- Secondary A
> -- Secondary B
> -- Secondary C
>
> Config:
> wal_level = 'logical'
> max_wal_senders = '10'
> max_replication_slots = '10'
> wal_keep_size = '512MB'
> hot_standby = 'on'
> sync_replication_slots = 'on'
> hot_standby_feedback = 'on'
> synchronized_standby_slots = 'Kafka_logical_slot'
>
>
> 1. The slotsync worker is running all the time (automatic sync).
> 2. When I create the logical replication slot (Kafka_logical_slot) on the Primary, it
> gets synced to both Secondary A and Secondary B.
> 3. It doesn't appear on Secondary C, since it's not a direct replica.
>
> Issue: When I stop the Primary node and promote one of the direct secondary
> nodes (A, B), the logical replication slot vanishes.
>
> Am I missing any configuration ?
>
> Please share your experience.
Thanks for reporting.
To narrow down potential causes, please confirm the following:
1) One possibility is that the slot has not been successfully synchronized to
the standby. To verify, check for the presence of the following log message:
LOG: newly created replication slot "your_slot" is sync-ready now
If this message is absent, it indicates that the slot has not been successfully
synced. Additionally, you can confirm the sync status by inspecting the
pg_replication_slots.temporary field on the standby; a value of true suggests
that the slot sync has not completed.
2) We typically recommend specifying the primary_slot_name on the standby to
prevent slot invalidation due to catalog row removal on the primary. Please
check your logs for possible invalidation messages:
LOG: invalidating obsolete replication slot "your_slot"
or
LOG: terminating process 12344 to release replication slot "your_slot"
3) Is there a chance that the slot was dropped on the primary before stopping
it and promoting the standby? If so, the synced slot would also be dropped
in this scenario.
Best Regards,
Hou zj
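The checks in points 1) to 3) can also be approximated from SQL. The following is a minimal sketch, assuming PostgreSQL 17 (where pg_replication_slots has the synced, failover and invalidation_reason columns) and the slot name used in this thread. Run it on each standby and on the primary, then compare the rows.

```sql
-- Inspect the sync state of the slot discussed in this thread.
-- On a standby, a sync-ready slot is expected to show
-- temporary = f and synced = t; temporary = t suggests the
-- initial sync has not completed yet.
-- On the primary, synced stays f; failover shows whether the
-- slot was created as a failover slot.
SELECT slot_name,
       slot_type,
       temporary,
       synced,
       failover,
       restart_lsn,
       confirmed_flush_lsn,
       invalidation_reason   -- non-NULL if the slot was invalidated
FROM pg_replication_slots
WHERE slot_name = 'kafka_logical_slot';
```

A row that exists on the primary but is missing or still temporary on the standby points at cause 1); a non-NULL invalidation_reason points at cause 2).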
Prerequisites for Setting Up a Logical Replication Slot sync in >= pg17

To successfully configure a logical replication slot, ensure the following settings are applied:

wal_level = 'logical'
hot_standby = 'on'
hot_standby_feedback = 'on'
sync_replication_slots = 'on'

Replication Slot Synchronization

Logical replication slots can synchronize with all direct standby servers of the primary but are not compatible with cascade standby servers.

Temporary Status of New Standby Slots

If a new standby server is created after the logical replication slot, it will be marked as temporary=true until the reset_lsn of the primary matches the confirmed_lsn of the new standby.

Limitations on Using Logical Replication Slots

While logical replication slots can synchronize on the direct standby side, they cannot be utilized (as in the case of Debezium) until the standby server is promoted to primary. Attempting to use a synchronized logical slot on a standby server will result in the following error:

org.postgresql.util.PSQLException: ERROR: cannot use replication slot "kafka_logical_slot" for logical decoding
Detail: This replication slot is being synchronized from the primary server.
replica_test=# SELECT * FROM pg_logical_slot_get_changes('kafka_logical_slot', NULL, NULL);
ERROR: option "proto_version" missing
CONTEXT: slot "kafka_logical_slot", output plugin "pgoutput", in the startup callback
Next, we can create a logical replication slot:
replica_test=# SELECT pg_create_logical_replication_slot('test', 'test_decoding', false, true, true);
pg_create_logical_replication_slot
------------------------------------
(test, 0/7B001AA0)
Now, let's attempt to retrieve changes from the new slot:
replica_test=# SELECT * FROM pg_logical_slot_get_changes('test', NULL, NULL);
WARNING: cannot specify logical replication slot "kafka_logical_slot" in parameter "synchronized_standby_slots"
DETAIL: Logical replication is waiting for correction on replication slot "kafka_logical_slot".
HINT: Remove the logical replication slot "kafka_logical_slot" from parameter "synchronized_standby_slots".
To resolve this, we will alter the system settings:
replica_test=# ALTER SYSTEM SET synchronized_standby_slots = '';
Finally, we can check for changes again:
replica_test=# SELECT * FROM pg_logical_slot_get_changes('test', NULL, NULL);
     lsn     | xid  |                      data
-------------+------+------------------------------------------------
 0/7B001AA0  | 1218 | BEGIN 1218
 0/7B00B9D0  | 1218 | table public.customers_1: TRUNCATE: (no-flags)
 0/7B00BB70  | 1218 | COMMIT 1218
Thanks Shveta,
Zhijie Hou
Please correct me if needed.
On Fri, Jun 13, 2025 at 1:00 PM Perumal Raj <perucinci@gmail.com> wrote:
>
> Yes Shveta!
>
> I could see repeated message in New-replica .
>
> 2025-06-13 06:20:30.146 UTC [277861] LOG: could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
> 2025-06-13 06:20:30.146 UTC [277861] DETAIL: The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
> 2025-06-13 06:21:00.176 UTC [277861] LOG: could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
> 2025-06-13 06:21:00.176 UTC [277861] DETAIL: The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
> 2025-06-13 06:21:30.207 UTC [277861] LOG: could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
> 2025-06-13 06:21:30.207 UTC [277861] DETAIL: The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
> 2025-06-13 06:22:00.238 UTC [277861] LOG: could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
> 2025-06-13 06:22:00.238 UTC [277861] DETAIL: The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
> 2025-06-13 06:22:30.268 UTC [277861] LOG: could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
> 2025-06-13 06:22:30.268 UTC [277861] DETAIL: The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
> 2025-06-13 06:23:00.299 UTC [277861] LOG: could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
> 2025-06-13 06:23:00.299 UTC [277861] DETAIL: The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
> 2025-06-13 06:23:30.329 UTC [277861] LOG: could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
> 2025-06-13 06:23:30.329 UTC [277861] DETAIL: The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
> 2025-06-13 06:24:00.360 UTC [277861] LOG: could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
> 2025-06-13 06:24:00.360 UTC [277861] DETAIL: The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
> 2025-06-13 06:24:30.391 UTC [277861] LOG: could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
> 2025-06-13 06:24:30.391 UTC [277861] DETAIL: The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
> 2025-06-13 06:25:00.421 UTC [277861] LOG: could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
> 2025-06-13 06:25:00.421 UTC [277861] DETAIL: The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
> 2025-06-13 06:25:30.452 UTC [277861] LOG: could not synchronize replication slot "kafka_logical_slot" because remote slot precedes local slot
> 2025-06-13 06:25:30.452 UTC [277861] DETAIL: The remote slot has LSN 0/6D0000B8 and catalog xmin 1085, but the local slot has LSN 0/6F000000 and catalog xmin 1088.
>
>
> It appears that my Debezium connectors have stopped consuming data, resulting in an outdated restart_lsn of "0/6D0000B8".
>
Yes, if there are no consumers consuming the changes on the failover
slot on primary, and meanwhile slot synchronization is started, the
initial sync may have such a temporary state of synced slot. This is
intentionally done to prevent the inconsistent state of the synced
slot and avoid unexpected behaviour if failover is performed at that
moment.
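To see why the worker reports "remote slot precedes local slot", the two LSNs from the DETAIL lines above can be compared directly. This is only an illustrative sketch using the built-in pg_lsn type; the literals are copied from the log excerpt and the column aliases are made up for readability.

```sql
-- Remote (primary) slot LSN vs. local (standby) slot LSN,
-- taken from the DETAIL lines in the log above.
SELECT '0/6D0000B8'::pg_lsn < '0/6F000000'::pg_lsn  AS remote_precedes_local,
       '0/6F000000'::pg_lsn - '0/6D0000B8'::pg_lsn  AS bytes_remote_is_behind;
```

As long as the primary's slot is not consumed, its LSN stays behind the standby's position and the synced slot remains in the temporary state.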
> In contrast, the New_replica has a restart_lsn that matches the primary server's most recent confirmed_flush_lsn, indicating it is up to date.
>
> As soon as I recreate that replication slot, it got sync with New_Replica(temporary=false) .
>
> 2025-06-13 06:26:00.484 UTC [277861] LOG: dropped replication slot "kafka_logical_slot" of database with OID 16384
>
> 2025-06-13 06:26:30.520 UTC [277861] LOG: starting logical decoding for slot "kafka_logical_slot"
>
> 2025-06-13 06:26:30.520 UTC [277861] DETAIL: Streaming transactions committing after 0/0, reading WAL from 0/76003140.
>
> 2025-06-13 06:26:30.520 UTC [277861] LOG: logical decoding found consistent point at 0/76003140
>
> 2025-06-13 06:26:30.520 UTC [277861] DETAIL: There are no running transactions.
>
> 2025-06-13 06:26:30.526 UTC [277861] LOG: newly created replication slot "kafka_logical_slot" is sync-ready now
>
> 2025-06-13 06:35:39.212 UTC [277857] LOG: restartpoint starting: time
>
> 2025-06-13 06:35:42.022 UTC [277857] LOG: restartpoint complete: wrote 29 buffers (0.2%); 0 WAL file(s) added, 0 removed, 0 recycled; write=2.805 s, sync=0.002 s, total=2.810 s; sync files=26, longest=0.002 s, average=0.001 s; distance=16496 kB, estimate=16496 kB; lsn=0/7701F480, redo lsn=0/7701F428
>
> 2025-06-13 06:35:42.022 UTC [277857] LOG: recovery restart point at 0/7701F428
>
> 2025-06-13 06:35:42.022 UTC [277857] DETAIL: Last completed transaction was at log time 2025-06-13 06:33:31.675341+00.
>
> Until the synchronization is complete, the slot type is marked as temporary=true, as you mentioned.
>
> is there any manual way to advance "restart_lsn" of logical replication slot ? This is to ensure slot synchronization.
>
1) The first and recommended option is to get the connector running
again and let it advance the slot by consuming the changes.
2) Another option is to manually advance the slot on the primary by
using pg_logical_slot_get_binary_changes(). However, if the logical
replication setup is intended to consume these changes but is
currently inactive, then slot's consumer will not be able to reprocess
those changes upon restarting. So the said API should be used only
after analyzing the current state of logical replication setup and if
we are okay with those changes not shipped to logical replication
consumers.
thanks
Shveta
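Option 2) could look roughly like the sketch below, run on the primary. It assumes the slot uses the pgoutput plugin, which requires the proto_version and publication_names options (compare the "option "proto_version" missing" error elsewhere in this thread). The publication name my_publication is hypothetical and must be replaced with the publication the connector actually uses. The changes consumed this way are discarded and will not be delivered to the connector later, so this is only appropriate after the analysis described above.

```sql
-- 1. Look at the current position of the slot before touching it.
SELECT slot_name, active, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots
WHERE slot_name = 'kafka_logical_slot';

-- 2. Consume all pending changes from the slot and throw them away.
--    'my_publication' is a placeholder, not a name from this thread.
SELECT count(*) AS discarded_changes
FROM pg_logical_slot_get_binary_changes(
         'kafka_logical_slot',   -- slot to advance
         NULL,                   -- upto_lsn: no limit
         NULL,                   -- upto_nchanges: no limit
         'proto_version', '1',
         'publication_names', 'my_publication');

-- 3. Re-check the slot; the LSNs may have moved forward, after which
--    the slotsync worker on the standby can retry the synchronization.
SELECT slot_name, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots
WHERE slot_name = 'kafka_logical_slot';
```

The call fails if the slot is currently active, so the connector has to be stopped while it runs.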
On Fri, Jun 13, 2025 at 10:52 PM Perumal Raj <perucinci@gmail.com> wrote:
>
> Thanks for explanation Shveta!
>
> ------------
> As Summary in this original thread,
>
> Prerequisites for Setting Up a Logical Replication Slot sync in >= pg17
>
> To successfully configure a logical replication slot, ensure the following settings are applied:
>
> wal_level = 'logical'
> hot_standby = 'on'
> hot_standby_feedback = 'on'
> sync_replication_slots = 'on'
>
Additionally, you need to configure primary_slot_name on the standby
and have dbname in primary_conninfo. For further details, you can
refer to the docs (1)(2).
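A minimal sketch of these two additional settings, assuming they are applied with ALTER SYSTEM. The slot name standby_a_slot, the host, and the replicator user are hypothetical; replica_test is the database shown in this thread. Setting primary_conninfo this way replaces the whole existing connection string, so any password or SSL options already in use need to be carried over.

```sql
-- On the primary: a physical slot for the standby to stream from.
-- 'standby_a_slot' is a made-up name for illustration.
SELECT pg_create_physical_replication_slot('standby_a_slot');

-- On the standby: point at that slot and add dbname to the
-- connection string so the slotsync worker can connect to a database.
ALTER SYSTEM SET primary_slot_name = 'standby_a_slot';
ALTER SYSTEM SET primary_conninfo =
    'host=primary.example.com port=5432 user=replicator dbname=replica_test';
ALTER SYSTEM SET hot_standby_feedback = on;
ALTER SYSTEM SET sync_replication_slots = on;

-- These parameters can be picked up with a configuration reload.
SELECT pg_reload_conf();
```

On the primary, synchronized_standby_slots is meant to list physical standby slots such as standby_a_slot, which is consistent with the WARNING shown elsewhere in this thread when a logical slot was listed there.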
> Replication Slot Synchronization
>
> Logical replication slots can synchronize with all direct standby servers of the primary but are not compatible with cascade standby servers.
>
> Temporary Status of New Standby Slots
>
> If a new standby server is created after the logical replication slot, it will be marked as temporary=true until the reset_lsn of the primary matches the confirmed_lsn of the new standby.
>
It is restart_lsn on both nodes, but there are other things like
slot's catalog_xmin as well. As a user, you need to ensure that your
primary's logical slot is being consumed. And this is required
primarily at the initial sync time so that we sync the slot only if
the standby has required resources like WAL to allow decoding from the
synced slot after failover.
> Limitations on Using Logical Replication Slots
>
> While logical replication slots can synchronize on the direct standby side, they cannot be utilized (as in the case of Debezium) until the standby server is promoted to primary. Attempting to use a synchronized logical slot on a standby server will result in the following error:
>
> org.postgresql.util.PSQLException: ERROR: cannot use replication slot "kafka_logical_slot" for logical decoding
> Detail: This replication slot is being synchronized from the primary server.
>
I don't think we can call this a limitation. According to me, this is
a requirement for this feature to work. Consider if we allow the use
of this synced slot for decoding when sync is still in-progress, this
slot could be advanced ahead of the primary. Now, after the failover,
we won't be able to reuse this slot to allow the subscribers to
continue replication.
(1) - https://www.postgresql.org/docs/devel/logicaldecoding-explanation.html#LOGICALDECODING-REPLICATION-SLOTS-SYNCHRONIZATION
(2) - https://www.postgresql.org/docs/devel/logical-replication-failover.html
--
With Regards,
Amit Kapila.