Hi,

On 2025-06-08 22:33:39 +0800, Xuneng Zhou wrote:
> This patch implements progressive backoff in XactLockTableWait() and
> ConditionalXactLockTableWait().
>
> As Kevin reported in this thread [1], XactLockTableWait() can enter a
> tight polling loop during logical replication slot creation on standby
> servers, sleeping for fixed 1ms intervals that can continue for a long
> time. This creates significant CPU overhead.
>
> The patch implements a time-based threshold approach based on Fujii’s
> idea [1]: keep sleeping for 1ms until the total sleep time reaches 10
> seconds, then start exponential backoff (doubling the sleep duration
> each cycle) up to a maximum of 10 seconds per sleep. This balances
> responsiveness for normal operations (which typically complete within
> seconds) against CPU efficiency for the long waits in some logical
> replication scenarios.
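
For reference, the schedule described above could be sketched roughly as
follows. This is only an illustration of the description, not code from the
patch: next_sleep_us() and the three constants are hypothetical names, and the
values are taken from the text quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants, values taken from the description above. */
#define BASE_SLEEP_US     INT64_C(1000)      /* fixed 1ms sleep */
#define BACKOFF_AFTER_US  INT64_C(10000000)  /* start backing off after 10s total */
#define MAX_SLEEP_US      INT64_C(10000000)  /* cap each sleep at 10s */

/*
 * Return the duration of the next sleep, given the total time slept so far
 * and the duration of the previous sleep (both in microseconds).
 */
static int64_t
next_sleep_us(int64_t total_slept_us, int64_t prev_sleep_us)
{
    int64_t next;

    assert(prev_sleep_us > 0);

    /* Below the threshold: keep polling at the fixed 1ms interval. */
    if (total_slept_us < BACKOFF_AFTER_US)
        return BASE_SLEEP_US;

    /* Past the threshold: double the previous sleep, up to the cap. */
    next = prev_sleep_us * 2;
    return (next > MAX_SLEEP_US) ? MAX_SLEEP_US : next;
}
```

Under these assumptions the loop still wakes up about 10,000 times during the
first 10 seconds before any backoff starts.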

ISTM that this is going the wrong way - the real problem is that we seem to
have extended periods where XactLockTableWait() doesn't actually work, not
that the sleep time is too short. The sleep in XactLockTableWait() was
intended to address a very short race, not something that's essentially
unbounded.

Greetings,
Andres Freund