Re: BUG #18961: Race scenario where max_standby_streaming_delay is not honored - Mailing list pgsql-bugs
From: Anthony Hsu
Subject: Re: BUG #18961: Race scenario where max_standby_streaming_delay is not honored
Msg-id: CALQc50hKhaoqZbk=CYRKx5gK9NykGDLUqb3HzMKF+Ezm1=Lw-A@mail.gmail.com
In response to: BUG #18961: Race scenario where max_standby_streaming_delay is not honored (PG Bug reporting form <noreply@postgresql.org>)
List: pgsql-bugs
Yes, another option might be to change the wakeup logic so that the startup process is still woken up even if new processes have acquired the pin. The main thing is that the startup process should be woken promptly so it can recheck whether it can acquire the cleanup lock and, if not, send PROCSIG_RECOVERY_CONFLICT_BUFFERPIN again to cancel any newly conflicting backends.
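To make that concrete, the kind of change I have in mind in the unpin path would look roughly like the following. This is only a hand-written sketch to illustrate the idea, not a tested patch, and the exact field and function names in bufmgr.c may differ from what I remember:

/*
 * Sketch only: on releasing a pin, wake the registered cleanup-lock
 * waiter even if other backends have taken new pins in the meantime.
 * The startup process then rechecks and, if it still cannot get the
 * cleanup lock, re-sends PROCSIG_RECOVERY_CONFLICT_BUFFERPIN.
 */
if (buf_state & BM_PIN_COUNT_WAITER)
{
    int         waiter_procno = buf->wait_backend_pgprocno;

    UnlockBufHdr(buf, buf_state);
    ProcSendSignal(waiter_procno);
}
else
    UnlockBufHdr(buf, buf_state);

That would mean some spurious wakeups of the startup process while it stays registered as the waiter, but it guarantees it gets a prompt chance to recheck.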
Before the standby limit time (default 30s) is reached, ResolveRecoveryConflictWithBufferPin enables both a standby limit timeout and a deadlock timeout (default 1s). So if someone holds a conflicting buffer pin for a long time, the deadlock timeout wakes the startup process every 1s to recheck, until the standby limit is reached, at which point it sends PROCSIG_RECOVERY_CONFLICT_BUFFERPIN. But after the standby limit has been reached, the next time the startup process enters ResolveRecoveryConflictWithBufferPin it only sends PROCSIG_RECOVERY_CONFLICT_BUFFERPIN without enabling the timeouts, which is what makes this race possible. So I thought a simple solution would be to just always enable the timeouts.
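To spell out what I mean by always enabling the timeouts: with the early-return branch removed, the code ahead of the existing ProcWaitForSignal(WAIT_EVENT_BUFFER_PIN) wait would unconditionally arm the timeouts, roughly like this (again just a sketch paraphrased from memory, so the variable names may not match standby.c exactly):

/*
 * Sketch: always arm the timeouts, even when the standby limit has
 * already expired, so the startup process is woken periodically to
 * recheck instead of relying solely on UnpinBuffer() to wake it.
 */
EnableTimeoutParams timeouts[2];
int         cnt = 0;

if (ltime != 0)
{
    timeouts[cnt].id = STANDBY_TIMEOUT;
    timeouts[cnt].type = TMPARAM_AT;
    timeouts[cnt].fin_time = ltime;     /* may already be in the past */
    cnt++;
}

timeouts[cnt].id = STANDBY_DEADLOCK_TIMEOUT;
timeouts[cnt].type = TMPARAM_AFTER;
timeouts[cnt].delay_ms = DeadlockTimeout;
cnt++;

enable_timeouts(timeouts, cnt);

Since LockBufferForCleanup() retries the cleanup lock in a loop, the timeouts get re-armed on every attempt, so once the limit has expired the startup process no longer depends solely on UnpinBuffer() to wake it up.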
For context, the reason I noticed this issue and am interested in getting it fixed is that I have seen cases where the startup process gets stalled for minutes and sometimes hours due to conflicting buffer pins, leading to high replay lag, even though the standby limit has not been changed (still the default 30s). My expectation is that once the standby limit has expired, conflicting backends should be promptly canceled.
On Thu, Jun 19, 2025 at 8:18 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Thu, Jun 19, 2025 at 5:25 PM PG Bug reporting form
<noreply@postgresql.org> wrote:
>
> The following bug has been logged on the website:
>
> Bug reference: 18961
> Logged by: Anthony Hsu
> Email address: erwaman@gmail.com
> PostgreSQL version: 17.5
> Operating system: Linux
> Description:
>
> In the current ResolveRecoveryConflictWithBufferPin implementation in
> standby.c, I think there's a race scenario where a backend holding a
> conflicting buffer pin won't receive a PROCSIG_RECOVERY_CONFLICT_BUFFERPIN
> message promptly:
> 1. Assume max_standby_streaming_delay has expired when the startup process
> enters ResolveRecoveryConflictWithBufferPin
> 2. Assume backend 1 holds a conflicting buffer pin while backend 2 does not
> 3. Since we are past the standby limit time, the startup process broadcasts
> PROCSIG_RECOVERY_CONFLICT_BUFFERPIN here [1] without enabling any timeouts
> 4. Then the startup process waits to be woken up via
> ProcWaitForSignal(WAIT_EVENT_BUFFER_PIN) here [2]
> 5. Suppose backend 2 receives PROCSIG_RECOVERY_CONFLICT_BUFFERPIN first,
> sees it does not hold a conflicting buffer pin, and *then* proceeds to pin
> the buffer
> 6. Suppose then backend 1 receives PROCSIG_RECOVERY_CONFLICT_BUFFERPIN,
> processes interrupts, and cancels itself. During cleanup, in UnpinBuffer(),
> it will see the pin count is still > 1 (startup process + backend 2 have it
> pinned), so it will NOT wake up the startup process.
> 7. The startup process will only get woken up once backend 2 unpins the
> buffer and the pin count reaches 1 (or some other signal causes the startup
> process's latch to be set). Only then will it try to acquire the cleanup
> lock again and broadcast another PROCSIG_RECOVERY_CONFLICT_BUFFERPIN message
> if it fails to acquire the cleanup lock.
> The unexpected behavior is in step (7), where it could be arbitrarily long
> until the startup process is woken up again. The expected behavior is that
> since max_standby_streaming_delay has already expired, the startup process
> should wake up quickly and broadcast another
> PROCSIG_RECOVERY_CONFLICT_BUFFERPIN message if there are still conflicting
> backends.
> I was able to reproduce this scenario with some custom code to control the
> execution sequence.
> One way to fix this scenario is to just remove the `if` block here [3]
> entirely so that we always enable the STANDBY_TIMEOUT and
> STANDBY_DEADLOCK_TIMEOUT timeouts.
> [1]
> https://github.com/postgres/postgres/blob/45c357e0e85d2dffe7af5440806150124a725a01/src/backend/storage/ipc/standby.c#L805
> [2]
> https://github.com/postgres/postgres/blob/45c357e0e85d2dffe7af5440806150124a725a01/src/backend/storage/ipc/standby.c#L842
> [3]
> https://github.com/postgres/postgres/blob/45c357e0e85d2dffe7af5440806150124a725a01/src/backend/storage/ipc/standby.c#L800-L806
I agree this looks like a race condition, but I am not sure about the
proposed solution. ResolveRecoveryConflictWithBufferPin() is called by
LockBufferForCleanup() to wait for a process currently holding a
buffer pin. However, new processes can still acquire the pin. So I
think the problem lies in the wakeup logic of UnpinBuffer(), no? The
intended behavior is for the startup process to be woken regardless of
backend 2 subsequently acquiring the pin, but the current tracking
mechanism is insufficient.
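For reference, the wakeup in question is gated roughly like this in UnpinBuffer() (paraphrased from memory; exact names may differ slightly). The waiter registered by LockBufferForCleanup() is signalled only when the unpinning backend leaves the waiter's pin as the last one remaining, which is exactly why backend 2's new pin suppresses the wakeup in the scenario above:

if ((buf_state & BM_PIN_COUNT_WAITER) &&
    BUF_STATE_GET_REFCOUNT(buf_state) == 1)
{
    /* we just released the last pin other than the waiter's: signal it */
    int         waiter_procno = buf->wait_backend_pgprocno;

    buf_state &= ~BM_PIN_COUNT_WAITER;
    UnlockBufHdr(buf, buf_state);
    ProcSendSignal(waiter_procno);
}
else
    UnlockBufHdr(buf, buf_state);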
--
Regards,
Dilip Kumar