Thread: Inconsistent behavior of smart shutdown handling for queries with and without parallel workers

Hi,

In case of smart shutdown postmaster receives SIGTERM from the pg_ctl,
it "disallows new connections, but lets existing sessions end their
work normally". Which means that it doesn't abort any ongoing txns in
any of the sessions and it lets the sessions to exit(on their own) and
then the postmaster is shut down. Looks like this behavior is true
only if the sessions are executing non-parallel queries. Parallel
queries are getting aborted, see [1].

Although the postmaster receives two different signals for
smart(SIGTERM) and fast(SIGINT) shutdowns, it only sends SIGTERM to
bgworkers for both the cases. (see pmdie() -> SignalSomeChildren() in
postmaster.c). In
StartBackgroundWorker(), bgworkers have the bgworker_die() as default
handler for SIGTERM, which just reports FATAL error. Is this handler
correct for both fast and smart shutdown for all types of bgworkers?

For parallel workers in ParallelWorkerMain(), SIGTERM handler gets
changed to die()(which means for both smart and fast shutdowns, the
same handler gets used), which sets ProcDiePending = true; and later
if the parallel workers try to CHECK_FOR_INTERRUPTS(); (for parallel
seq scan, it is done in ExecScanFetch()), since ProcDiePending was set
to true, the parallel workers throw error "terminating connection due
to administrator command" in ProcessInterrupts().
Having die() as a handler for fast shutdown may be correct, but for
smart shutdown, as mentioned in $subject, it looks inconsistent.

1. In general, do we need to allow postmaster to send different
signals to bgworkers for fast and smart shutdowns and let them
differentiate the two modes(if needed)?
2. Do we need to handle smart shutdown for dynamic bgworkers(parallel
workers) with a different signal than SIGTERM and retain SIGTERM
handler die() as is, since SIGTERM is being sent to bgworkers from
other places as well? If we do so, we can block that new signal until
the parallel workers finish the current query execution or completely
ignore it in ParallelWorkerMain(). If the bgw_flags flag is
BGWORKER_CLASS_PARALLEL, we can do some changes in postmaster's
SignalSomeChildren() to detect and send that new signal. Looks like
SIGUSR2 remains currently ignored for dynamic bgworker, and can be
used for this purpose.

Thoughts?

Thanks @vignesh C  for inputs and writeup review.

[1] (smart shutdown issued with pg_ctl, while the parallel query is running).
postgres=# EXPLAIN ANALYZE SELECT COUNT(*) FROM t_test t1, t_test t2
WHERE t1.many = t2.many;
ERROR:  terminating connection due to administrator command
CONTEXT:  parallel worker

[2] The usual behavior of: smart shutdown - lets existing sessions end
their work normally and fast shutdown - abort their current
transactions and exit promptly.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes:
> In case of smart shutdown postmaster receives SIGTERM from the pg_ctl,
> it "disallows new connections, but lets existing sessions end their
> work normally". Which means that it doesn't abort any ongoing txns in
> any of the sessions and it lets the sessions to exit(on their own) and
> then the postmaster is shut down. Looks like this behavior is true
> only if the sessions are executing non-parallel queries. Parallel
> queries are getting aborted, see [1].

Hm.  I kind of wonder why we're killing *anything* early in the
smart-shutdown case.  postmaster.c has (in pmdie()):

                /* autovac workers are told to shut down immediately */
                /* and bgworkers too; does this need tweaking? */
                SignalSomeChildren(SIGTERM,
                                   BACKEND_TYPE_AUTOVAC | BACKEND_TYPE_BGWORKER);
                /* and the autovac launcher too */
                if (AutoVacPID != 0)
                    signal_child(AutoVacPID, SIGTERM);
                /* and the bgwriter too */
                if (BgWriterPID != 0)
                    signal_child(BgWriterPID, SIGTERM);
                /* and the walwriter too */
                if (WalWriterPID != 0)
                    signal_child(WalWriterPID, SIGTERM);

and it seems like every one of those actions is pretty damfool if we want
the existing sessions to continue to have normal performance, quite aside
from whether we're breaking parallel queries.  Indeed, to the extent that
this is hurting performance, we can expect the existing sessions to take
*longer* to finish, making this pretty counterproductive.

So I'm thinking we should move all of these actions to happen only after
the regular children are all gone.

            regards, tom lane



I think the inconsistent behaviour reported in this thread gets
resolved with the approach and patch being discussed in [1].

>
> 1. In general, do we need to allow postmaster to send different
> signals to bgworkers for fast and smart shutdowns and let them
> differentiate the two modes(if needed)?
>

Is there any way the bgworkers(for that matter, any postmaster's child
process) knowing that there's a smart shutdown pending? This is
useful, if any of the bgworker(if not parallel workers) want to
differentiate the two modes i.e. smart and fast shutdown modes and
smartly finish of their work.

[1] - https://www.postgresql.org/message-id/469199.1597337108%40sss.pgh.pa.us

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes:
> Is there any way the bgworkers(for that matter, any postmaster's child
> process) knowing that there's a smart shutdown pending? This is
> useful, if any of the bgworker(if not parallel workers) want to
> differentiate the two modes i.e. smart and fast shutdown modes and
> smartly finish of their work.

With the patch I'm working on, the approach is basically that smart
shutdown changes nothing except for not allowing new connections ...
until the last regular connection is gone, at which point it starts to
act exactly like fast shutdown.  So in those terms there is no need for
bgworkers to know the difference.  If a bgworker did act differently
during the initial phase of a smart shutdown, that would arguably be
a bug, just as it's a bug that parallel query isn't working.

            regards, tom lane