Re: common signal handler protection - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: common signal handler protection
Date
Msg-id 20231127221625.GB140989@nathanxps13
Whole thread Raw
In response to Re: common signal handler protection  (Andres Freund <andres@anarazel.de>)
Responses Re: common signal handler protection
Re: common signal handler protection
List pgsql-hackers
Thanks for taking a look.

On Wed, Nov 22, 2023 at 02:59:45PM -0800, Andres Freund wrote:
> On 2023-11-22 15:59:44 -0600, Nathan Bossart wrote:
>> +/*
>> + * Except when called with SIG_IGN or SIG_DFL, pqsignal() sets up this function
>> + * as the handler for all signals.  This wrapper handler function checks that
>> + * it is called within a process that the server knows about, and not a
>> + * grandchild process forked by system(3), etc.
> 
> Similar comment to earlier - the grandchildren bit seems like a dangling
> reference. And also too specific - I think we could encounter this in single
> user mode as well?
> 
> Perhaps it could be reframed to "postgres processes, as determined by having
> called InitProcessGlobals()"?

Eh, apparently my attempt at being clever didn't pan out.  I like your idea
of specifying InitProcessGlobals(), but I might also include "e.g., client
backends", too.

> Hm. I wonder if this indicates an issue.  Do the preceding changes perhaps
> make it more likely that a child process of the startup process could hang
> around for longer, because the signal we're delivering doesn't terminate child
> processes, because we'd just reset the signal handler?  I think it's fine for
> the startup process, because we ask the startup process to shut down with
> SIGTERM, which defaults to exiting.
> 
> But we do have a few processes that we do ask to shut down with other
> signals, wich do not trigger an exit by default, e.g. Checkpointer, archiver,
> walsender are asked to shut down using SIGUSR2 IIRC.  The only one where that
> could be an issue is archiver, I guess?

This did cross my mind.  AFAICT most default handlers already trigger an
exit (including SIGUSR2), and for the ones that don't, we'd just end up in
the same situation as if the signal arrived a moment later after the child
process has installed its own handlers.  And postmaster doesn't send
certain signals (e.g., SIGHUP) to the whole process group, so we don't have
the opposite problem where things like reloading configuration files
terminates these child processes.

So, I didn't notice any potential issues.  Did you have anything else in
mind?

>> -    /* not safe if forked by system(), etc. */
>> -    if (MyProc->pid != (int) getpid())
>> -        elog(PANIC, "AuxiliaryProcKill() called in child process");
>> -
>>      auxproc = &AuxiliaryProcs[proctype];
>>  
>>      Assert(MyProc == auxproc);
> 
> I think we should leave these checks. It's awful to debug situations where a
> proc gets reused and the cost of the checks is irrelevant. The checks don't
> just protect against child processes, they also protect e.g. against calling
> those functions twice (we IIRC had cases of that).

Sure, that's no problem.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: POC, WIP: OR-clause support for indexes
Next
From: Andres Freund
Date:
Subject: Re: Don't use bms_membership in places where it's not needed