Re: Replication slot is not able to sync up - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: Replication slot is not able to sync up |
Date | |
Msg-id | CAHut+Pv6V42xJ-00LSwR4a4PtU71ZTS-ewuDMANKQDz5WgmTcw@mail.gmail.com |
In response to | Re: Replication slot is not able to sync up (Amit Kapila <amit.kapila16@gmail.com>) |
Responses | Re: Replication slot is not able to sync up |
List | pgsql-hackers |
On Wed, Jun 11, 2025 at 8:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jun 11, 2025 at 7:19 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Tue, Jun 10, 2025 at 3:20 PM Zhijie Hou (Fujitsu)
> > <houzj.fnst@fujitsu.com> wrote:
> > >
> > >
> > > Thanks for updating the patch.
> > >
> > > I have few suggestions for the document from a user's perspective.
> > >
> >
> > Thanks Hou-San, I agree with your suggestions. Addressed in v4.
> >
> > Also addressed Amit's suggestion at [1] to improve errdetail.
> >
>
> So, the overall direction we are taking here is that we want to
> improve the existing LOG/DEBUG messages and docs for HEAD and back
> branches. Then we will improve the API behavior based on Hou-San's
> patch for PG19. Let me know if you or others think otherwise.
>
> + <para>
> + Apart from enabling <link linkend="guc-sync-replication-slots">
> + <varname>sync_replication_slots</varname></link> to synchronize slots
> + periodically, failover slots can be manually synchronized by invoking
> + <link linkend="pg-sync-replication-slots">
> + <function>pg_sync_replication_slots</function></link> on the standby.
> + However, this function is primarily intended for testing and debugging
> + purposes and should be used with caution. The recommended approach to
> + synchronize slots is by enabling <link linkend="guc-sync-replication-slots">
> + <varname>sync_replication_slots</varname></link> on the standby, as it
> + ensures continuous and automatic synchronization of replication slots,
> + facilitating seamless failover and high availability.
> + </para>
> +
> + <para>
> + When slot-synchronization setup is done as recommended, and
> + slot-synchronization is performed the very first time either automatically
> + or by <link linkend="pg-sync-replication-slots">
> + <function>pg_sync_replication_slots</function></link>,
> + then for the synchronized slot to be created and persisted on the standby,
> + one condition must be met. The logical replication slot on the primary
> + must reach a state where the WALs and system catalog rows retained by
> + the slot are also present on the corresponding standby server. This is
> + needed to prevent any data loss and to allow logical replication to continue
> + seamlessly through the synchronized slot if needed after promotion.
> + If the WALs and system catalog rows retained by the slot on the primary have
> + already been purged from the standby server, and synchronization is attempted
> + for the first time, then to prevent the data loss as explained, persistence
> + and synchronization of newly created slot will be skipped, and the following
> + log message may appear on standby.
> +<programlisting>
> + LOG: could not synchronize replication slot "failover_slot"
> + DETAIL: Synchronization could lead to data loss as the remote slot needs WAL at LSN 0/3003F28 and catalog xmin 754, but the standby has LSN 0/3003F28 and catalog xmin 756
> +</programlisting>
> + If the logical replication slot is actively consumed by a consumer, no further
> + manual action is needed by the user, as the slot on primary will be advanced
> + automatically, and synchronization will proceed in the next cycle. However,
> + if no logical replication consumer is set up yet, to advance the slot, it
> + is recommended to manually run the <link linkend="pg-logical-slot-get-changes">
> + <function>pg_logical_slot_get_changes</function></link> or
> + <link linkend="pg-logical-slot-get-binary-changes">
> + <function>pg_logical_slot_get_binary_changes</function></link> on the primary
> + slot and allow synchronization to proceed.
> + </para>
> +
>
> I have reworded the above as follows:
> To enable periodic synchronization of replication slots, it is
> recommended to activate sync_replication_slots on the standby server.
> While manual synchronization is possible using
> pg_sync_replication_slots, this function is primarily intended for
> testing and debugging and should be used with caution. Automatic
> synchronization via sync_replication_slots ensures continuous slot
> updates, supporting seamless failover and maintaining high
> availability. When slot synchronization is configured as recommended,
> and the initial synchronization is performed either automatically or
> manually via pg_sync_replication_slot, the standby can persist the
> synchronized slot only if the following condition is met: The logical
> replication slot on the primary must retain WALs and system catalog
> rows that are still available on the standby. This ensures data
> integrity and allows logical replication to continue smoothly after
> promotion.
> If the required WALs or catalog rows have already been purged from the
> standby, the slot will not be persisted to avoid data loss. In such
> cases, the following log message may appear:
>
> LOG: could not synchronize replication slot "failover_slot"
> DETAIL: Synchronization could lead to data loss as the remote slot
> needs WAL at LSN 0/3003F28 and catalog xmin 754, but the standby has
> LSN 0/3003F28 and catalog xmin 756
>
> If the logical replication slot is actively used by a consumer, no
> manual intervention is needed; the slot will advance automatically,
> and synchronization will resume in the next cycle. However, if no
> consumer is configured, it is advisable to manually advance the slot
> on the primary using pg_logical_slot_get_changes or
> pg_logical_slot_get_binary_changes, allowing synchronization to
> proceed.
>
> Let me know what you think of above?
>

Phrases like "... it is recommended..." and "... intended for testing
and debugging .. " and "... should be used with caution." and "... it
is advisable to..." seem like indicators that parts of the above
description should be using SGML markup such as <caution> or <warning>
or <note> instead of just plain text.

======
Kind Regards,
Peter Smith.
Fujitsu Australia
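
To illustrate the markup suggestion above, here is a rough sketch of how the advisory sentences might be split out of the running text. The choice of <caution> for the manual function and <note> for the slot-advancing advice, as well as the trimmed wording, are assumptions made for illustration and are not taken from any posted patch version; the linkend values are the ones that appear in the quoted patch.

```sgml
<!-- Illustrative sketch only: element choice and wording are assumptions -->
<para>
 To synchronize failover slots periodically, enable
 <link linkend="guc-sync-replication-slots">
 <varname>sync_replication_slots</varname></link> on the standby.
</para>

<caution>
 <para>
  Slots can also be synchronized manually by invoking
  <link linkend="pg-sync-replication-slots">
  <function>pg_sync_replication_slots</function></link> on the standby,
  but this function is primarily intended for testing and debugging.
 </para>
</caution>

<note>
 <para>
  If no logical replication consumer is set up yet, advance the slot on
  the primary with
  <link linkend="pg-logical-slot-get-changes">
  <function>pg_logical_slot_get_changes</function></link> or
  <link linkend="pg-logical-slot-get-binary-changes">
  <function>pg_logical_slot_get_binary_changes</function></link>
  so that synchronization can proceed.
 </para>
</note>
```

Whether a given sentence belongs in <caution>, <warning>, or <note> is a judgment call for the patch author; the sketch only shows the general shape.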