Re: SIGQUIT handling, redux - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: SIGQUIT handling, redux
Date	September 10, 2020 02:07:06
Msg-id	148145.1599703626@sss.pgh.pa.us
In response to	SIGQUIT handling, redux (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	RE: SIGQUIT handling, redux Re: SIGQUIT handling, redux
List	pgsql-hackers

I wrote:
> Of course, this is only safe if the SIGQUIT handler is safe to be invoked
> anywhere, so I did a quick survey of backend signal handlers to see if
> that is true.

This is straying a bit from the stated topic of this thread, but ...
I did some further looking around to see whether there were any
unsafe signal handlers besides SIGQUIT ones.  The situation is not
too awful, but I did find several issues not already mentioned
in this thread:

StartupProcShutdownHandler (SIGTERM)

This conditionally calls proc_exit(1).  The conditions boil down
to are-we-interrupting-a-system(3)-call, so who knows how safe
that is?  I wouldn't care to bet that system() doesn't use malloc,
for instance.  Still, the odds are very good that if a signal did
arrive, it'd be interrupting system()'s waitpid() or equivalent
kernel call, which is likely safe enough.

bgworker_die (SIGTERM)

Calls ereport(FATAL).  This is surely not any safer than, say,
quickdie().  No, it's worse, because at least that won't try
to go out via proc_exit().

FloatExceptionHandler (SIGFPE)

Calls ereport(ERROR).  This might be okay, though, since the
trap should be synchronous with the offending calculation.
Besides, if you're risking divide by zero or the like in
critical code, You're Doing It Wrong.

RecoveryConflictInterrupt (called from SIGUSR1)

Calls a whole boatload of state tests that were never designed
to be interrupt-safe, such as transaction-state-related inquiries
in xact.c.  The lack of any explicit awareness in this code that
it's in a signal handler doesn't discourage people from inserting
even more dubious stuff.  I think this needs to be burned to the
ground and rewritten.

StandbyDeadLockHandler (from SIGALRM)
StandbyTimeoutHandler (ditto)

Calls CancelDBBackends, which just for starters tries to acquire
an LWLock.  I think the only reason we've gotten away with this
for so long is the high probability that by the time either
timeout fires, we're going to be blocked on a semaphore.
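
For readers less familiar with the hazard: the usual way to avoid taking a
lock inside a handler is to have the handler only set a flag, and to do the
real work later from normal code. A minimal standalone sketch of that
pattern follows (plain POSIX C, not PostgreSQL code; every demo_ name is
hypothetical, and this is not a worked-out fix for the handlers above):

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Set by the handler, consumed by normal code. */
static volatile sig_atomic_t timeout_pending = 0;

/* Counts how often the deferred work actually ran. */
static int  cancels_issued = 0;

/* Handler: records that the timeout fired and does nothing else. */
static void
demo_timeout_handler(int signo)
{
    (void) signo;
    timeout_pending = 1;
}

/* Stand-in for work that takes locks; must not run in a handler. */
static void
demo_cancel_backends(void)
{
    cancels_issued++;
}

/*
 * Called from normal (non-signal) context at a point where taking
 * locks is known to be safe.  Returns true if work was performed.
 */
static bool
demo_process_pending_timeout(void)
{
    if (!timeout_pending)
        return false;
    timeout_pending = 0;
    demo_cancel_backends();
    return true;
}

/* Install the handler for SIGALRM; returns 0 on success. */
static int
demo_install_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = demo_timeout_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGALRM, &sa, NULL);
}
```

After demo_install_handler(), a raise(SIGALRM) only sets timeout_pending;
the lock-taking stand-in runs once the main loop next calls
demo_process_pending_timeout(). The cost is latency: the work waits until
the process reaches such a check.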

I don't have any ideas about how to fix any of these things,
but I thought getting the problems on the record would be good.

            regards, tom lane
