Re: make the stats collector shutdown without writing the statsfiles if the immediate shutdown is requested. - Mailing list pgsql-hackers

From Andres Freund
Subject Re: make the stats collector shutdown without writing the statsfiles if the immediate shutdown is requested.
Date
Msg-id 20210325005825.t6jtm67cpkmvr2ja@alap3.anarazel.de
Whole thread Raw
In response to Re: make the stats collector shutdown without writing the statsfiles if the immediate shutdown is requested.  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses Re: make the stats collector shutdown without writing the statsfiles if the immediate shutdown is requested.  (Fujii Masao <masao.fujii@oss.nttdata.com>)
List pgsql-hackers
Hi,

On 2021-03-24 18:36:14 +0900, Fujii Masao wrote:
> Barring any objection, I'm thinking to commit this patch.

To which branches?  I am *strongly* opposed to backpatching this.


>  /*
> - * Simple signal handler for exiting quickly as if due to a crash.
> + * Simple signal handler for processes which have not yet touched or do not
> + * touch shared memory to exit quickly.
>   *
> - * Normally, this would be used for handling SIGQUIT.
> + * Note that if processes already touched shared memory, use
> + * SignalHandlerForCrashExit() instead and force the postmaster into
> + * a system reset cycle because shared memory may be corrupted.
> + */
> +void
> +SignalHandlerForNonCrashExit(SIGNAL_ARGS)
> +{
> +    /*
> +     * Since we don't touch shared memory, we can just pull the plug and exit
> +     * without running any atexit handlers.
> +     */
> +    _exit(1);
> +}

This strikes me as a quite a misleading function name. Outside of very
narrow circumstances a normal exit shouldn't use _exit(1). Neither 1 no
_exit() (as opposed to exit()) seem appropriate. This seems like a bad
idea.

Also, won't this lead to postmaster now starting to complain about
pgstat having crashed in an immediate shutdown? Right now only exit
status 0 is expected:

        if (pid == PgStatPID)
        {
            PgStatPID = 0;
            if (!EXIT_STATUS_0(exitstatus))
                LogChildExit(LOG, _("statistics collector process"),
                             pid, exitstatus);
            if (pmState == PM_RUN || pmState == PM_HOT_STANDBY)
                PgStatPID = pgstat_start();
            continue;
        }



> + * The statistics collector is started by the postmaster as soon as the
> + * startup subprocess finishes, or as soon as the postmaster is ready
> + * to accept read only connections during archive recovery.  It remains
> + * alive until the postmaster commands it to terminate.  Normal
> + * termination is by SIGUSR2 after the checkpointer must exit(0),
> + * which instructs the statistics collector to save the final statistics
> + * to reuse at next startup and then exit(0).

I can't parse "...after the checkpointer must exit(0), which instructs..."


> + * Emergency termination is by SIGQUIT; like any backend, the statistics
> + * collector will exit quickly without saving the final statistics.  It's
> + * ok because the startup process will remove all statistics at next
> + * startup after emergency termination.

Normally there won't be any stats to remove, no? The permanent stats
file has been removed when the stats collector started up.


Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Masahiro Ikeda
Date:
Subject: Re: make the stats collector shutdown without writing the statsfiles if the immediate shutdown is requested.
Next
From: Thomas Munro
Date:
Subject: Re: MultiXact\SLRU buffers configuration