common signal handler protection - Mailing list pgsql-hackers

From Nathan Bossart
Subject common signal handler protection
Date
Msg-id 20231121212008.GA3742740@nathanxps13
Whole thread Raw
Responses Re: common signal handler protection
Re: common signal handler protection
List pgsql-hackers
In commit 97550c0, I added a "MyProcPid == getpid()" check in the SIGTERM
handler for the startup process.  This ensures that processes forked by
system(3) (i.e., for restore_command) that have yet to install their own
signal handlers do not call proc_exit() upon receiving SIGTERM.  Without
this protection, both the startup process and the restore_command process
might try to remove themselves from the PGPROC shared array (among other
things), which can end badly.

Since then, I've been exploring a more general approach that would offer
protection against similar issues in the future.  We probably don't want
signal handlers in these grandchild processes to touch shared memory at
all.  The attached 0001 is an attempt at adding such protection for all
handlers installed via pqsignal().  In short, it stores the actual handler
functions in a separate array, and sigaction() is given a wrapper function
that performs the "MyProcPid == getpid()" check.  If that check fails, the
wrapper function installs the default signal handler and calls it.

Besides allowing us to revert commit 97550c0 (see attached 0003), this
wrapper handler could also restore errno, as shown in 0002.  Right now,
individual signal handlers must do this today as needed, but that seems
easy to miss and prone to going unnoticed for a long time.

I see two main downsides of this proposal:

* Overhead: The wrapper handler calls a function pointer and getpid(),
  which AFAICT is a real system call on most platforms.  That might not be
  a tremendous amount of overhead, but it's not zero, either.  I'm
  particularly worried about signal-heavy code like synchronous
  replication.  (Are there other areas that should be tested?)  If this is
  a concern, perhaps we could allow certain processes to opt out of this
  wrapper handler, provided we believe it is unlikely to fork or that the
  handler code is safe to run in grandchild processes.

* Race conditions: With these patches, pqsignal() becomes quite racy when
  used within signal handlers.  Specifically, you might get a bogus return
  value.  However, there are no in-tree callers of pqsignal() that look at
  the return value (and I don't see any reason they should), and it seems
  unlikely that pqsignal() will be used within signal handlers frequently,
  so this might not be a deal-breaker.  I did consider trying to convert
  pqsignal() into a void function, but IIUC that would require an SONAME
  bump.  For now, I've just documented the bogosity of the return values.

Thoughts?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: proposal: possibility to read dumped table's name from file
Next
From: Andrew Dunstan
Date:
Subject: Re: Remove distprep