Re: VM corruption on standby - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: VM corruption on standby
Date
Msg-id aKT7qD0VkGhQgFJe@paquier.xyz
In response to Re: VM corruption on standby  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, Aug 19, 2025 at 03:55:41PM -0400, Tom Lane wrote:
> Yeah, I was coming to similar conclusions in the reply I just sent:
> we don't really want a policy that we can't put injection-point-based
> delays inside critical sections.  So that infrastructure is leaving
> something to be desired.

Yes, it doesn't make sense to restrict the use of injection point
waits within critical sections.  A simple change that we could make is
to rely on a clock-based check in the wait() routine, removing the
condition variable part.  It costs in responsiveness because the
wakeup() routine would not be able to ping the wait() part to recheck
the shmem counter immediately.  But we could reduce the delay with a
variable recheck interval, say doubling the time between rechecks of
the counter after each loop, capped at a maximum of a couple of hundred
ms, so that it's still responsive enough on fast machines and does not
put too much stress on slow machines.  That costs a bit more in terms
of clock calls and
delay checks, but it would be low-level enough that the internal
interrupts would not matter if we rely on syscalls, I guess?  And we
don't care about efficiency in this case.
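
To make the idea concrete, here is a rough standalone sketch of such a
polling loop.  It is not a patch against the injection_points module:
DemoWaitSlot, demo_wakeup() and demo_wait_for_wakeup() are hypothetical
names, a C11 atomic stands in for the shmem counter, nanosleep() stands
in for whatever sleep primitive would be used, and the timeout exists
only so that the demo terminates.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical stand-in for the counter kept in shared memory. */
typedef struct DemoWaitSlot
{
    atomic_uint wait_counts;
} DemoWaitSlot;

/* Double the recheck delay after each loop, capped at max_ms. */
static long
demo_next_delay_ms(long cur_ms, long max_ms)
{
    long next_ms = cur_ms * 2;

    return (next_ms > max_ms) ? max_ms : next_ms;
}

/* Plain clock-based sleep; no condition variable involved. */
static void
demo_sleep_ms(long ms)
{
    struct timespec ts;

    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    (void) nanosleep(&ts, NULL);
}

/* wakeup() side: only bumps the counter, nothing is signalled. */
static void
demo_wakeup(DemoWaitSlot *slot)
{
    atomic_fetch_add(&slot->wait_counts, 1);
}

/*
 * wait() side: recheck the counter, sleeping between rechecks with a
 * delay that doubles up to max_ms.  Returns the number of rechecks
 * done when the counter moved away from old_count, or -1 once at
 * least timeout_ms has been spent sleeping (demo-only safety net).
 */
static int
demo_wait_for_wakeup(DemoWaitSlot *slot, unsigned old_count,
                     long initial_ms, long max_ms, long timeout_ms)
{
    long delay_ms = initial_ms;
    long waited_ms = 0;
    int  polls = 0;

    assert(initial_ms > 0 && max_ms >= initial_ms);

    for (;;)
    {
        polls++;
        if (atomic_load(&slot->wait_counts) != old_count)
            return polls;
        if (waited_ms >= timeout_ms)
            return -1;

        demo_sleep_ms(delay_ms);
        waited_ms += delay_ms;
        delay_ms = demo_next_delay_ms(delay_ms, max_ms);
    }
}
```

With an initial delay of 1ms and a cap of 200ms, the cap is reached
after eight doublings, so a waiter that is woken up early reacts
within a few ms while a long wait settles at one recheck per 200ms.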
--
Michael
