Re: How can end users know the cause of LR slot sync delays? - Mailing list pgsql-hackers
| From | Shlok Kyal |
|---|---|
| Subject | Re: How can end users know the cause of LR slot sync delays? |
| Date | |
| Msg-id | CANhcyEWm2M+OgrnEZDLt8K91+N12SvTLYq6QMKgbmEWEukV60w@mail.gmail.com Whole thread Raw |
| In response to | Re: How can end users know the cause of LR slot sync delays? (Amit Kapila <amit.kapila16@gmail.com>) |
| List | pgsql-hackers |
On Fri, 21 Nov 2025 at 09:58, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Nov 21, 2025 at 8:52 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Tue, Nov 18, 2025 at 4:07 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > > > > > On Fri, 14 Nov 2025 at 14:13, Hayato Kuroda (Fujitsu) > > > <kuroda.hayato@fujitsu.com> wrote: > > > > > > > > Dear Shlok, > > > > > > > > Thanks for updating the patch. Few more comments. > > > > > > > > > > > > I’m not sure if this has already been discussed; I couldn’t find any > > > > > > > > mention of it in the thread. Why don’t we persist > > > > > > > > 'slot_sync_skip_reason' (it is outside of > > > > > > > > ReplicationSlotPersistentData)? If a slot wasn’t synced during the > > > > > > > > last cycle and the server restarts, it would be helpful to know the > > > > > > > > reason it wasn’t synced prior to the node restart. > > > > > > > > > > > > > Actually I did not think in this direction. I think it will be useful > > > > > to persist 'slot_sync_skip_reason'. I have made the change for the > > > > > same in the latest patch. > > > > > > > > Hmm, I'm wondering it should be written on the disk. Other attributes on the disk > > > > are essential to decode or replicate changes correctly, but sync status is not > > > > used for the purpose. Personally considered, slot sync would re-start soon after > > > > the reboot so that it is OK to start with empty. How about others? > > > > > > > > If we want to serialize the info, we should do further tasks: > > > > - update SLOT_VERSION > > > > - make the slot dirty then SaveSlotToPath() when the status is updated. > > > > > > > I agree with your point. Slot synchronization will restart shortly > > > after a reboot, so it seems reasonable to begin with an empty state > > > rather than persisting slot_sync_skip_reason. > > > For now, I’ve updated the patch so that slot_sync_skip_reason is no > > > longer persisted; its initialization is kept outside of > > > ReplicationSlotPersistentData. I’d also like to hear what others > > > think. > > > > > > > Users may even use an API to synchronize the slots rather than > > slotsync worker. In that case synchronization won't start immediately > > after server-restart. > > > > But I think after restart in most cases, the slot will be created > fresh as we persist the slot for the first time only when sync is > successful. Now, when the standby has not flushed the WAL > corresponding to remote_lsn (SS_SKIP_WAL_NOT_FLUSHED), the slotsync > can be skipped even for persisted slots but that should be rare and we > anyway won't be able to persist the slot_skip reason in other cases as > slot itself won't be persisted by that time. So, I feel keeping the > slot_sync_skip_reason in memory is sufficient. > I agree. I have added a comment why we decided to keep slotsync_skip_reason in_memory Attached the latest patch in [1]. [1]: https://www.postgresql.org/message-id/CANhcyEUiY4ENuoi7kZSsLJFLn6yA_-oPCKrek%3DBaMfFfY3%3DP1w%40mail.gmail.com Thanks Shlok Kyal
pgsql-hackers by date: