Re: Replication slot is not able to sync up - Mailing list pgsql-hackers

From Peter Smith
Subject Re: Replication slot is not able to sync up
Msg-id CAHut+Pv6V42xJ-00LSwR4a4PtU71ZTS-ewuDMANKQDz5WgmTcw@mail.gmail.com
In response to Re: Replication slot is not able to sync up  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Replication slot is not able to sync up
List pgsql-hackers
On Wed, Jun 11, 2025 at 8:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jun 11, 2025 at 7:19 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Tue, Jun 10, 2025 at 3:20 PM Zhijie Hou (Fujitsu)
> > <houzj.fnst@fujitsu.com> wrote:
> > >
> > >
> > > Thanks for updating the patch.
> > >
> > > I have few suggestions for the document from a user's perspective.
> > >
> >
> > Thanks Hou-San, I agree with your suggestions. Addressed in v4.
> >
> > Also addressed Amit's suggestion at [1] to improve errdetail.
> >
>
> So, the overall direction we are taking here is that we want to
> improve the existing LOG/DEBUG messages and docs for HEAD and back
> branches. Then we will improve the API behavior based on Hou-San's
> patch for PG19. Let me know if you or others think otherwise.
>
> +    <para>
> +     Apart from enabling <link linkend="guc-sync-replication-slots">
> +     <varname>sync_replication_slots</varname></link> to synchronize slots
> +     periodically, failover slots can be manually synchronized by invoking
> +     <link linkend="pg-sync-replication-slots">
> +     <function>pg_sync_replication_slots</function></link> on the standby.
> +     However, this function is primarily intended for testing and debugging
> +     purposes and should be used with caution. The recommended approach to
> +     synchronize slots is by enabling <link linkend="guc-sync-replication-slots">
> +     <varname>sync_replication_slots</varname></link> on the standby, as it
> +     ensures continuous and automatic synchronization of replication slots,
> +     facilitating seamless failover and high availability.
> +    </para>
> +
> +    <para>
> +     When slot-synchronization setup is done as recommended, and
> +     slot-synchronization is performed the very first time either automatically
> +     or by <link linkend="pg-sync-replication-slots">
> +     <function>pg_sync_replication_slots</function></link>,
> +     then for the synchronized slot to be created and persisted on the standby,
> +     one condition must be met. The logical replication slot on the primary
> +     must reach a state where the WALs and system catalog rows retained by
> +     the slot are also present on the corresponding standby server. This is
> +     needed to prevent any data loss and to allow logical replication to continue
> +     seamlessly through the synchronized slot if needed after promotion.
> +     If the WALs and system catalog rows retained by the slot on the primary have
> +     already been purged from the standby server, and synchronization is attempted
> +     for the first time, then to prevent the data loss as explained, persistence
> +     and synchronization of newly created slot will be skipped, and the following
> +     log message may appear on standby.
> +<programlisting>
> +     LOG:  could not synchronize replication slot "failover_slot"
> +     DETAIL:  Synchronization could lead to data loss as the remote slot needs WAL at LSN 0/3003F28 and catalog xmin 754, but the standby has LSN 0/3003F28 and catalog xmin 756
> +</programlisting>
> +     If the logical replication slot is actively consumed by a consumer, no further
> +     manual action is needed by the user, as the slot on primary will be advanced
> +     automatically, and synchronization will proceed in the next cycle. However,
> +     if no logical replication consumer is set up yet, to advance the slot, it
> +     is recommended to manually run the <link linkend="pg-logical-slot-get-changes">
> +     <function>pg_logical_slot_get_changes</function></link> or
> +     <link linkend="pg-logical-slot-get-binary-changes">
> +     <function>pg_logical_slot_get_binary_changes</function></link> on the primary
> +     slot and allow synchronization to proceed.
> +    </para>
> +
>
> I have reworded the above as follows:
> To enable periodic synchronization of replication slots, it is
> recommended to activate sync_replication_slots on the standby server.
> While manual synchronization is possible using
> pg_sync_replication_slots, this function is primarily intended for
> testing and debugging and should be used with caution. Automatic
> synchronization via sync_replication_slots ensures continuous slot
> updates, supporting seamless failover and maintaining high
> availability. When slot synchronization is configured as recommended,
> and the initial synchronization is performed either automatically or
> manually via pg_sync_replication_slots, the standby can persist the
> synchronized slot only if the following condition is met: The logical
> replication slot on the primary must retain WALs and system catalog
> rows that are still available on the standby. This ensures data
> integrity and allows logical replication to continue smoothly after
> promotion.
> If the required WALs or catalog rows have already been purged from the
> standby, the slot will not be persisted to avoid data loss. In such
> cases, the following log message may appear:
>
> LOG: could not synchronize replication slot "failover_slot"
> DETAIL: Synchronization could lead to data loss as the remote slot
> needs WAL at LSN 0/3003F28 and catalog xmin 754, but the standby has
> LSN 0/3003F28 and catalog xmin 756
>
> If the logical replication slot is actively used by a consumer, no
> manual intervention is needed; the slot will advance automatically,
> and synchronization will resume in the next cycle. However, if no
> consumer is configured, it is advisable to manually advance the slot
> on the primary using pg_logical_slot_get_changes or
> pg_logical_slot_get_binary_changes, allowing synchronization to
> proceed.
>
> Let me know what you think of above?
>

Phrases such as "it is recommended ...", "intended for testing and
debugging", "should be used with caution", and "it is advisable to ..."
suggest that parts of the above description should use SGML markup
such as <caution>, <warning>, or <note> instead of plain text.
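
As a rough illustration only (this is not taken from any posted patch),
the reworded text could be split along the following lines. The
linkend values are copied from the quoted patch; the choice of
<caution> for the manual function and <note> for the advice on
advancing the slot is an assumption, and the wording is abbreviated.

```sgml
<para>
 To enable periodic synchronization of replication slots, activate
 <link linkend="guc-sync-replication-slots">
 <varname>sync_replication_slots</varname></link> on the standby server.
</para>

<caution>
 <para>
  Manual synchronization using
  <link linkend="pg-sync-replication-slots">
  <function>pg_sync_replication_slots</function></link> is primarily
  intended for testing and debugging.
 </para>
</caution>

<note>
 <para>
  If no consumer is configured, manually advance the slot on the
  primary using <function>pg_logical_slot_get_changes</function> or
  <function>pg_logical_slot_get_binary_changes</function>.
 </para>
</note>
```

Whether <caution> or <warning> is the better fit depends on how
strongly the docs want to discourage manual synchronization.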

======
Kind Regards,
Peter Smith.
Fujitsu Australia


