Re: Use standard SIGHUP and SIGTERM handlers in autoprewarm module - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Use standard SIGHUP and SIGTERM handlers in autoprewarm module
Date
Msg-id CALj2ACWWy1YcngpCUn09AsXMfOzwjfNqbVosfoRY0vhhVWhVBw@mail.gmail.com
Whole thread Raw
In response to Re: Use standard SIGHUP and SIGTERM handlers in autoprewarm module  (Craig Ringer <craig.ringer@enterprisedb.com>)
Responses Re: Use standard SIGHUP and SIGTERM handlers in autoprewarm module  (Fujii Masao <masao.fujii@oss.nttdata.com>)
List pgsql-hackers
Thanks Craig!

On Fri, Oct 23, 2020 at 9:37 AM Craig Ringer
<craig.ringer@enterprisedb.com> wrote:
>
> src/test/modules/test_shm_mq/worker.c appears to do the right thing the wrong way - it has its own custom handler
insteadof using die() or SignalHandlerForShutdownRequest(). 
>

handle_sigterm() and die() are similar except that die() has extra
handling(below) for exiting immediately when waiting for input/command
from the client.
    /*
     * If we're in single user mode, we want to quit immediately - we can't
     * rely on latches as they wouldn't work when stdin/stdout is a file.
     * Rather ugly, but it's unlikely to be worthwhile to invest much more
     * effort just for the benefit of single user mode.
     */
    if (DoingCommandRead && whereToSendOutput != DestRemote)
        ProcessInterrupts();

Having this extra handling is correct for normal backends as they can
connect directly to clients for reading input commands, the above if
condition may become true and ProcessInterrupts() may be called to
handle the SIGTERM immediately.

Note that DoingCommandRead can never be true in bg workers as it's
being set to true only in normal backend PostgresMain(). If we use
die()(instead of custom SIGTERM handlers such as handle_sigterm()) in
a bg worker/non-backend process, there are no chances that the above
part of the code gets hit and the interrupts are processed
immediately. And also here are the bg worker process that use die()
instead of their own custom handlers: parallel workers, autoprewarm
worker, autovacuum worker, logical replication launcher and apply
worker, wal sender process

I think we can also use die() instead of handle_sigterm() in
test_shm_mq/worker.c.

I attached a patch for this change.

>
> In contrast  src/test/modules/worker_spi/worker_spi.c looks plain wrong. Especially since it's quoted as an example
ofhow to do things right. It won't respond to SIGTERM at all while it's executing a query from its queue, no matter how
longthat query takes or whether it blocks. It can inhibit even postmaster shutdown as a result. 
>

Postmaster sends SIGTERM to all children(backends and bgworkers) for
both smart shutdown(wait for children to end their work, then shut
down.) and fast shutdown(rollback active transactions and shutdown
when they are gone.) For immediate shutdown SIGQUIT is sent to
children.(see pmdie()).

For each bg worker(so is for worker_spi.c),
SignalHandlerForCrashExit() is set as SIGQUIT handler in
InitPostmasterChild(), which does nothing but exits immediately. We
can not use quickdie() here because as a bg worker we don't have
to/can not send anything to client. We are good with
SignalHandlerForCrashExit() as with all other bg workers.

Yes, if having worker_spi_sigterm/SignalHandlerForShutdownRequest,
sometimes(as explained above) the worker_spi worker may not respond to
SIGTERM. I think we should be having die() as SIGTERM handler here (as
with normal backend and parallel workers).

Attaching another patch that has replaced custom SIGTERM handler
worker_spi_sigterm() with die() and custom SIGHUP handler
worker_spi_sighup() with standard SignalHandlerForConfigReload().

>
> I was going to lob off a quick patch to fix this by making both use quickdie() for SIGQUIT and die() for SIGTERM, but
afterreading for a bit I'm no longer sure what the right choice even is. I'd welcome some opinions. 
>

We can not use quickdie() here because as a bg worker we don't have
to/can not send anything to client. We are good with
SignalHandlerForCrashExit() as with all other bg workers.

Thoughts?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: [HACKERS] logical decoding of two-phase transactions
Next
From: Alvaro Herrera
Date:
Subject: Re: pg_proc.dat "proargmodes is not a 1-D char array"