Home > mailing lists

Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler? - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler?
Date	April 10, 2022 00:39:16
Msg-id	20220409213916.4ydygnkf37dd4sw2@alap3.anarazel.de Whole thread Raw
In response to	Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler? (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler? (Andres Freund <andres@anarazel.de>)
List	pgsql-hackers

Tree view

Hi,

On 2022-04-09 17:00:41 -0400, Tom Lane wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > Unlike most "procsignal" handler routines, RecoveryConflictInterrupt()
> > doesn't just set a sig_atomic_t flag and poke the latch.  Is the extra
> > stuff it does safe?  For example, is this call stack OK (to pick one
> > that jumps out, but not the only one)?
> 
> > procsignal_sigusr1_handler
> > -> RecoveryConflictInterrupt
> >  -> HoldingBufferPinThatDelaysRecovery
> >   -> GetPrivateRefCount
> >    -> GetPrivateRefCountEntry
> >     -> hash_search(...hash table that might be in the middle of an update...)
> 
> Ugh.  That one was safe before somebody decided we needed a hash table
> for buffer refcounts, but it's surely not safe now.

Mea culpa. This is 4b4b680c3d6d - from 2014.

> Which, of course, demonstrates the folly of allowing signal handlers to call
> much of anything; but especially doing so without clearly marking the called
> functions as needing to be signal safe.

Yea. Particularly when just going through bufmgr and updating places that look
at pin counts, it's not immediately obvious that
HoldingBufferPinThatDelaysRecovery() runs in a signal handler. Partially
because RecoveryConflictInterrupt() - which is mentioned in the comment above
HoldingBufferPinThatDelaysRecovery() - sounds a lot like it's called from
ProcessInterrupts(), which doesn't run in a signal handler...

RecoveryConflictInterrupt() calls a lot of functions, some of which quite
plausibly could be changed to not be signal safe, even if they currently are.

Is there really a reason for RecoveryConflictInterrupt() to run in a signal
handler? Given that we only react to conflicts in ProcessInterrupts(), it's
not immediately obvious that we need to do anything in
RecoveryConflictInterrupt() but set some flags.  There's probably some minor
efficiency gains, but that seems unconvincing.

The comments really need a rewrite - it sounds like
RecoveryConflictInterrupt() will error out itself:

                /*
                 * If we can abort just the current subtransaction then we are
                 * OK to throw an ERROR to resolve the conflict. Otherwise
                 * drop through to the FATAL case.
                 *
                 * XXX other times that we can throw just an ERROR *may* be
                 * PROCSIG_RECOVERY_CONFLICT_LOCK if no locks are held in
                 * parent transactions
                 *
                 * PROCSIG_RECOVERY_CONFLICT_SNAPSHOT if no snapshots are held
                 * by parent transactions and the transaction is not
                 * transaction-snapshot mode
                 *
                 * PROCSIG_RECOVERY_CONFLICT_TABLESPACE if no temp files or
                 * cursors open in parent transactions
                 */

it's technically not *wrong* because it's setting up state that then leads to
ERROR / FATAL being thrown, but ...

Greetings,

Andres Freund

pgsql-hackers by date:

From: Tom Lane
Date: 10 April 2022, 00:00:41
Subject: Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler?

From: Andres Freund
Date: 10 April 2022, 01:00:54
Subject: Re: failures in t/031_recovery_conflict.pl on CI

Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler? - Mailing list pgsql-hackers

Previous

Next