Re: Improve pg_sync_replication_slots() to wait for primary to advance - Mailing list pgsql-hackers
From | shveta malik |
---|---|
Subject | Re: Improve pg_sync_replication_slots() to wait for primary to advance |
Date | |
Msg-id | CAJpy0uAoyk+ASXpbOa8exk60XutKyHtuBXhRo0ccRO3svTSO_Q@mail.gmail.com Whole thread Raw |
In response to | Re: Improve pg_sync_replication_slots() to wait for primary to advance (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Improve pg_sync_replication_slots() to wait for primary to advance
|
List | pgsql-hackers |
On Mon, Aug 4, 2025 at 3:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Aug 4, 2025 at 12:19 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Mon, Aug 4, 2025 at 11:31 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Fri, Aug 1, 2025 at 2:50 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > 5) > > > > I tried a test where there were 4 slots on the publisher, where one > > > > was getting used while the others were not. Initiated > > > > pg_sync_replication_slots on standby. Forced unused slots to be > > > > invalidated by setting idle_replication_slot_timeout=60 on primary, > > > > due to which API finished but gave a warning: > > > > > > > > postgres=# SELECT pg_sync_replication_slots(); > > > > WARNING: aborting initial sync for slot "failover_slot" > > > > DETAIL: This slot was invalidated on the primary server. > > > > WARNING: aborting initial sync for slot "failover_slot2" > > > > DETAIL: This slot was invalidated on the primary server. > > > > WARNING: aborting initial sync for slot "failover_slot3" > > > > DETAIL: This slot was invalidated on the primary server. > > > > pg_sync_replication_slots > > > > --------------------------- > > > > > > > > (1 row) > > > > > > > > Do we need these warnings here? I think we can have it as a LOG rather > > > > than having it on console. Thoughts? > > > > > > > > > > What is the behaviour of a slotsync worker in the same case? I don't > > > see any such WARNING messages in the code of slotsync worker, so why > > > do we want a different behaviour here? > > > > > > > We don’t have continuous waiting in the slot-sync worker if the remote > > slot is behind the local slot. But if during the first sync cycle the > > remote slot is behind, we keep the local slot as a temporary slot. In > > the next sync cycle, if we find the remote slot is invalidated, we > > mark the local slot as invalidated too, keeping it in this temporary > > state. There are no LOG or WARNING messages in this case. When the > > slot-sync worker stops or shuts down (like during promotion), it > > cleans up this temporary slot. > > > > Now, for the API behavior: if the remote slot is behind the local > > slot, the API enters a wait loop and logs: > > > > LOG: waiting for remote slot "failover_slot" LSN (0/3000060) and > > catalog xmin (755) to pass local slot LSN (0/3000060) and catalog xmin > > (770) > > > > If it keeps waiting, every 10 seconds it logs: > > LOG: continuing to wait for remote slot "failover_slot" LSN > > (0/3000060) and catalog xmin (755) to pass local slot LSN (0/3000060) > > and catalog xmin (770) > > > > If the remote slot becomes invalidated during this wait, currently it > > logs a WARNING and moves to syncing the next slot: > > WARNING: aborting initial sync for slot "failover_slot" as the slot > > was invalidated on primary > > > > I think this WARNING is too strong. We could change it to a LOG > > message instead, mark the local slot as invalidated, exit the wait > > loop, and move on to syncing the next slot. > > > > Even though this LOG is not there in slotsync worker case, I think it > > makes more sense in API case due to continuous LOGs which suggested > > that API was waiting to sync this slot in wait-loop and thus we shall > > conclude it either by saying wait-over (like we do in successful sync > > case) or we can say 'LOG: aborting wait as the remote slot was > > invalidated' instead of above WARNING message. What do you suggest? > > > > I also think LOG is a better choice for this because there is nothing > we can expect users to do even after seeing this. I feel this is more > of an info for users. > Yes, it is more of an info for users. > 1. > update_and_persist_local_synced_slot() > { > ... > + /* > + * For SQL API synchronization, we wait for the remote slot to catch up > + * here, since we can't assume the SQL API will be called again soon. > + * We will retry the sync once the slot catches up. > + * > + * Note: This will return false if a promotion is triggered on the > + * standby while waiting, in which case we stop syncing and drop the > + * temporary slot. > + */ > + if (!wait_for_primary_slot_catchup(wrconn, remote_slot)) > + return false; > > Why is the wait added at this level? Shouldn't it be at API level aka > in SyncReplicationSlots() or pg_sync_replication_slots() similar to > what we do in ReplSlotSyncWorkerMain() for slotsync workers? The initial goal was to perform a single sync cycle for all slots. The logic was simple, if any slot couldn’t be synced because its remote slot was lagging, we would wait for the remote slot to catch up, and only then move on to the next slot. But if we consider moving wait logic to SyncReplicationSlots(), we will necessarily have to go with the logic that it will attempt to sync all slots in the first sync cycle, skipping those where remote slots are lagging. It will then continue with multiple sync cycles until all slots are successfully synced. But if new remote slots are added in the meantime, they will be picked up in the next cycle, and the API then has to wait on those as well and this cycle may go on for longer. If we want to avoid continuously syncing newly added slots in later cycles and instead focus only on the ones that failed to sync during the first attempt, one approach is to maintain a list of failed slots from the initial cycle and only retry those in subsequent attempts. But this will add complexity to the implementation. IMO, attempting multiple sync cycles essentially makes the API behave more like slotsyncworker, which might not be desirable. I feel that performing just one sync cycle in the API is more in line with the expected behavior. And for that, the current implementation of wait-logic seems simpler. But let me know if you think otherwise or I have not understood your point clearly. I am open to more approaches/ideas here. thanks Shveta
pgsql-hackers by date: