Re: Proposal for Signal Detection Refactoring - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Proposal for Signal Detection Refactoring
Date
Msg-id 20190214190942.4o6w6mx5jjcec6p2@alap3.anarazel.de
Whole thread Raw
In response to Re: Proposal for Signal Detection Refactoring  (Chris Travers <chris.travers@adjust.com>)
Responses Re: Proposal for Signal Detection Refactoring  (Chris Travers <chris.travers@adjust.com>)
List pgsql-hackers
Hi,

On 2019-01-23 11:55:09 +0100, Chris Travers wrote:
> +Implementation Notes on Globals and Signal/Event Handling
> +=========================================================
> +
> +The approch to signal handling in PostgreSQL is designed to strictly conform
> +with the C89 standard and designed to run with as few platform assumptions as
> +possible.

I'm not clear as to what that means. For one we don't target C99
anymore, for another a lot of the signal handling stuff we do isn't
defined by C99, but versions of posix.


> +The primary constraint in signal handling is that things are interruptable.
> +This means that the signal handler may be interrupted at any point and that
> +execution may return to it at a later point in time.

That seems wrong. The primary problem is that *non* signal handler code
can be interrupted at ay time. Sure signal handlers interrupting each
other is a thing, but comparatively that doesn't add a ton of
complexity.


> +How PostgreSQL Handles Signals
> +------------------------------
> +
> +Most signals (except SIGSEGV, and SIGKILL) are blocked by PostgreSQL
> +during process startup.  Certain signals are given specific meaning and
> +trapped by signal handlers.  The primary signal handlers of the backends, 
> +located in src/backend/tcop/postgres.c, typically just write to variables of
> +sig_atomic_t (as documented below) and return control back to the main code.

Note that the other absolutely crucial thing is to save/restore errno.

I don't really think that signals handlers "return control back to the
main code".


> +An exception is made for SIGQUIT which is used by the postmaster to terminate
> +backend sessions quickly when another backend dies so that the postmaster
> +may re-initialize shared memory and otherwise return to a known-good state.
> +
> +The signals are then checked later when the CHECK_FOR_INTERRUPTS() macro is

s/signals/variables/


> +called.  This macro conditionally calls CheckPendingInterrupts in 
> +src/backend/tcop/postgres.c if InterruptPending is set.  This allows for
> +query cancellation and process termination to be done, under ordiary cases,
> +in a timely and orderly way, without posing problems for shared resources such
> +as shard memory and semaphores.

s/shard/shared/


> +CHECK_FOR_INTERRUPTS() may be called only when there is no more cleanup to do
> +because the query or process may be aborted.  However query cancellation
> +requests and SIGTERM signals will not be processed until the next time this is
> +called.

I don't think that's correct. Most resources can be cleaned up at
transaction abort.


> +Checking and Handling Interrupts
> +--------------------------------
> +
> +CHECK_FOR_SIGNALS() may cause a non-local exit because the function it wraps
> +utilizes PostgreSQL's built-in exception framework (ereport) to abort queries
> +and can call exit() to exit a orocess.

You mean CHECK_FOR_INTERRUPTS?


> +For this reason, it is imperative that CHECK_FOR_INTERRUPTS() is not called in
> +places that require additional cleanup, such as dynamic shared memory, temporary
> +files, or any other shared resource.  Instead CHECK_FOR_INTERRUPTS() should be
> +called at the next point when it is safe to do so.

That's largely wrong, there's cleanup mechanisms fo rmost of the above.


Greetings,

Andres Freund


pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Using POPCNT and other advanced bit manipulation instructions
Next
From: Andres Freund
Date:
Subject: Re: INSTALL file