Re: Fix slotsync worker busy loop causing repeated log messages - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Fix slotsync worker busy loop causing repeated log messages
Date
Msg-id CAA4eK1KLk+TWyNPJ=z6SzQQXySc-N9Gs3eR-QKfV+MX7vfJWiw@mail.gmail.com
In response to Fix slotsync worker busy loop causing repeated log messages  (Fujii Masao <masao.fujii@gmail.com>)
Responses RE: Fix slotsync worker busy loop causing repeated log messages
Re: Fix slotsync worker busy loop causing repeated log messages
List pgsql-hackers
On Fri, Feb 27, 2026 at 8:34 PM Fujii Masao <masao.fujii@gmail.com> wrote:
>
> Normally, the slotsync worker updates the standby slot using the primary's slot
> state. However, when confirmed_flush_lsn matches but restart_lsn does not,
> the worker does not actually update the standby slot. Despite that, the current
> code of update_local_synced_slot() appears to treat this situation as if
> an update occurred. As a result, the worker sleeps only for the minimum
> interval (200 ms) before retrying. In the next cycle, it again assumes
> an update happened, and continues looping with the short sleep interval,
> causing the repeated logical decoding log messages. Based on a quick analysis,
> this seems to be the root cause.
>
> I think update_local_synced_slot() should return false (i.e., no update
> happened) when confirmed_flush_lsn is equal but restart_lsn differs between
> primary and standby.
>

In such a case, we expect update_local_synced_slot() to advance
local_slot's 'restart_lsn' via LogicalSlotAdvanceAndCheckSnapState();
otherwise, it won't take the cheap code path next time. Normally,
restart_lsn advancement happens when we process an XLOG_RUNNING_XACTS
record and call SnapBuildProcessRunningXacts(). In this particular
case, as both restart_lsn and confirmed_flush_lsn are the same
(0/03000140), the machinery may not be processing an XLOG_RUNNING_XACTS
record. I have not debugged the exact case yet, but you can try
emitting some more records on the publisher; that should let the
standby advance the slot. It is possible that we can do something like
what you are proposing to silence the LOG messages, but we should first
understand what is going on here.
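
To make the reported busy loop concrete, here is a minimal standalone
sketch of the sleep-interval logic as described in the quoted report. It
is not the actual slotsync worker code. The function names are
hypothetical, the doubling backoff and the 30 s cap are assumptions, and
only the 200 ms minimum comes from the report. It shows that if every
cycle reports "updated", the nap time never grows beyond the minimum.

```c
#include <stdbool.h>

/* 200 ms minimum is taken from the report; the 30 s cap is an assumption. */
#define MIN_NAPTIME_MS 200L
#define MAX_NAPTIME_MS 30000L

/*
 * Hypothetical helper: compute the next nap time from the current one.
 * Assumed policy: reset to the minimum when a slot was reported as
 * updated, otherwise double the nap time up to the cap.
 */
long
next_naptime_ms(long current_ms, bool slot_updated)
{
    if (slot_updated)
        return MIN_NAPTIME_MS;

    if (current_ms * 2 > MAX_NAPTIME_MS)
        return MAX_NAPTIME_MS;

    return current_ms * 2;
}

/*
 * Hypothetical driver: run a number of sync cycles in which the update
 * function always returns the same answer, and return the final nap time.
 */
long
simulate_naptime_ms(int cycles, bool reported_updated)
{
    long nap = MIN_NAPTIME_MS;

    for (int i = 0; i < cycles; i++)
        nap = next_naptime_ms(nap, reported_updated);

    return nap;
}
```

Under these assumptions, simulate_naptime_ms(10, true) stays at the
minimum, which matches the repeated short sleeps in the report, while
simulate_naptime_ms(10, false) backs off to the cap. This only
illustrates the symptom; it does not explain why restart_lsn fails to
advance.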

--
With Regards,
Amit Kapila.


