Re: BUG #18961: Race scenario where max_standby_streaming_delay is not honored - Mailing list pgsql-bugs

From Dilip Kumar
Subject Re: BUG #18961: Race scenario where max_standby_streaming_delay is not honored
Date
Msg-id CAFiTN-sXBgoV7KZ1cRED1aMgOFWdUUU0XRqpkiGFWfm9C0T7xw@mail.gmail.com
In response to Re: BUG #18961: Race scenario where max_standby_streaming_delay is not honored  (Anthony Hsu <erwaman@gmail.com>)
List pgsql-bugs
On Fri, Jun 27, 2025 at 11:41 AM Anthony Hsu <erwaman@gmail.com> wrote:
>
>
>
> On Thu, Jun 26, 2025 at 10:05 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>>
>> On Thu, Jun 19, 2025 at 9:28 PM Anthony Hsu <erwaman@gmail.com> wrote:
>> >
>> > Yes, another option might be to change the wakeup logic so that the startup process is still woken up even if new
>> > processes have acquired the pin. The main thing is the startup process should be woken up promptly so that it can
>> > recheck if it can acquire the cleanup lock and if not, send PROCSIG_RECOVERY_CONFLICT_BUFFERPIN again to cancel any new
>> > backends.
>> >
>> > Before the standby limit time (default 30s) is reached, ResolveRecoveryConflictWithBufferPin will enable both a
>> > standby limit timeout and a deadlock timeout (default 1s). So if someone is holding a conflicting buffer pin for a long
>> > time, due to the deadlock timeout, the startup process will get woken up every 1s and recheck until we reach the standby
>> > limit, at which point it'll send the PROCSIG_RECOVERY_CONFLICT_BUFFERPIN. But after reaching standby limit, the next
>> > time the startup process does ResolveRecoveryConflictWithBufferPin, it only sends PROCSIG_RECOVERY_CONFLICT_BUFFERPIN
>> > without enabling the timeouts, which leads to the possibility of this race. So I thought a simple solution to address
>> > this race would be to just always enable the timeouts.
>>
>> Looking at the code again, I got confused by the placement of
>> ProcWaitForSignal(WAIT_EVENT_BUFFER_PIN). If we are already behind, we
>> broadcast PROCSIG_RECOVERY_CONFLICT_BUFFERPIN and then wait for the
>> WAIT_EVENT_BUFFER_PIN signal. Otherwise, we wait on the timeouts, and
>> once woken up by a timeout we broadcast
>> PROCSIG_RECOVERY_CONFLICT_BUFFERPIN, but now we don't bother to wait
>> for WAIT_EVENT_BUFFER_PIN.
>
>
> If we get woken by the standby_delay timeout, we broadcast PROCSIG_RECOVERY_CONFLICT_BUFFERPIN and then return from
> this method (ResolveRecoveryConflictWithBufferPin) back to LockBufferForCleanup, which will then loop back to the
> beginning of the for loop [1]. It will then check if it is the exclusive pinner again, and if not, re-enter
> ResolveRecoveryConflictWithBufferPin, and then wait for WAIT_EVENT_BUFFER_PIN.
>
Oh yeah, that's correct.
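
For anyone following along, the control flow described above can be sketched
as a toy model in Python. This is only an illustration, not PostgreSQL source:
the function names mirror the identifiers discussed in this thread, while the
state dictionary, the trace strings, and the assumption that the periodic
deadlock-timeout wakeups can be skipped are all made up for the sketch.

```python
# Toy model only. Names mirror identifiers from this thread; the state dict,
# trace strings, and timing shortcuts are hypothetical, not PostgreSQL code.

def resolve_conflict(state, trace):
    """Stand-in for ResolveRecoveryConflictWithBufferPin."""
    if state["now"] >= state["standby_limit"]:
        # Already past the standby limit: signal at once, no timeouts armed.
        trace.append("broadcast PROCSIG_RECOVERY_CONFLICT_BUFFERPIN")
        trace.append("wait WAIT_EVENT_BUFFER_PIN (no timeouts)")
        # Assumption: the cancelled backends unpin and the wakeup arrives.
        state["other_pins"] = 0
    else:
        trace.append("enable standby-limit and deadlock timeouts")
        trace.append("wait WAIT_EVENT_BUFFER_PIN")
        # Assumption: nobody unpins, so the standby-limit timeout wakes us.
        # (The periodic deadlock-timeout wakeups are omitted here.)
        state["now"] = state["standby_limit"]
        trace.append("broadcast PROCSIG_RECOVERY_CONFLICT_BUFFERPIN")
    # Either way we return to the caller, which rechecks the pin count.


def lock_buffer_for_cleanup(state):
    """Stand-in for the retry loop in LockBufferForCleanup."""
    trace = []
    while state["other_pins"] > 0:  # not yet the exclusive pinner
        resolve_conflict(state, trace)
    trace.append("cleanup lock acquired")
    return trace


if __name__ == "__main__":
    demo = {"now": 0.0, "standby_limit": 30.0, "other_pins": 1}
    for step in lock_buffer_for_cleanup(demo):
        print(step)
```

The second pass through resolve_conflict is the one that waits with no
timeout armed. The sketch simply assumes the wakeup arrives there; the race
reported in this bug is the case where it does not, which the model makes no
attempt to reproduce.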

--
Regards,
Dilip Kumar
Google


