Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler? - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler?
Msg-id 20220409230013.yt7siryxxo4yujhy@alap3.anarazel.de
In response to Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler?  (Andres Freund <andres@anarazel.de>)
Responses Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler?  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
Hi,

On 2022-04-09 14:39:16 -0700, Andres Freund wrote:
> On 2022-04-09 17:00:41 -0400, Tom Lane wrote:
> > Thomas Munro <thomas.munro@gmail.com> writes:
> > > Unlike most "procsignal" handler routines, RecoveryConflictInterrupt()
> > > doesn't just set a sig_atomic_t flag and poke the latch.  Is the extra
> > > stuff it does safe?  For example, is this call stack OK (to pick one
> > > that jumps out, but not the only one)?
> > 
> > > procsignal_sigusr1_handler
> > > -> RecoveryConflictInterrupt
> > >  -> HoldingBufferPinThatDelaysRecovery
> > >   -> GetPrivateRefCount
> > >    -> GetPrivateRefCountEntry
> > >     -> hash_search(...hash table that might be in the middle of an update...)
> > 
> > Ugh.  That one was safe before somebody decided we needed a hash table
> > for buffer refcounts, but it's surely not safe now.
> 
> Mea culpa. This is 4b4b680c3d6d - from 2014.

Whoa. It gets way worse: StandbyTimeoutHandler() calls
SendRecoveryConflictWithBufferPin(), which calls CancelDBBackends(), which
acquires lwlocks etc.

That is very plausibly the cause of the issue I'm investigating in
https://www.postgresql.org/message-id/20220409220054.fqn5arvbeesmxdg5%40alap3.anarazel.de
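
For contrast, here is a minimal sketch of the pattern the thread describes as
safe: the handler only sets a sig_atomic_t flag, and everything that touches
shared state (hash tables, locks) runs later from normal code. All names below
(conflict_pending, conflict_signal_handler, process_pending_conflict,
install_conflict_handler) are hypothetical and chosen for illustration; this is
not PostgreSQL's actual code, and the latch wakeup is omitted.

```c
#include <signal.h>
#include <stdbool.h>

/* Hypothetical flag; the only state the signal handler is allowed to touch. */
static volatile sig_atomic_t conflict_pending = 0;

/* Hypothetical counter standing in for the "real" conflict processing. */
static int conflicts_handled = 0;

/*
 * Signal handler: record that something happened and return.
 * No hash table lookups, no lock acquisition, no memory allocation.
 */
static void
conflict_signal_handler(int signo)
{
    (void) signo;
    conflict_pending = 1;
}

/*
 * Called from normal (non-signal) context at a point where it is safe to
 * inspect data structures and take locks. Returns true if a pending
 * conflict was processed.
 */
static bool
process_pending_conflict(void)
{
    if (!conflict_pending)
        return false;

    conflict_pending = 0;

    /* The unsafe-in-a-handler work would go here. */
    conflicts_handled++;
    return true;
}

/* Install the handler for SIGUSR1 (the signal choice is an assumption). */
static void
install_conflict_handler(void)
{
    signal(SIGUSR1, conflict_signal_handler);
}
```

A caller would run install_conflict_handler() once at startup and then call
process_pending_conflict() from its main loop or another known-safe point.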

Greetings,

Andres Freund


