Hi,
On 2021-03-24 18:36:14 +0900, Fujii Masao wrote:
> Barring any objection, I'm thinking to commit this patch.
To which branches? I am *strongly* opposed to backpatching this.
> /*
> - * Simple signal handler for exiting quickly as if due to a crash.
> + * Simple signal handler for processes which have not yet touched or do not
> + * touch shared memory to exit quickly.
> *
> - * Normally, this would be used for handling SIGQUIT.
> + * Note that if processes already touched shared memory, use
> + * SignalHandlerForCrashExit() instead and force the postmaster into
> + * a system reset cycle because shared memory may be corrupted.
> + */
> +void
> +SignalHandlerForNonCrashExit(SIGNAL_ARGS)
> +{
> + /*
> + * Since we don't touch shared memory, we can just pull the plug and exit
> + * without running any atexit handlers.
> + */
> + _exit(1);
> +}
This strikes me as a quite a misleading function name. Outside of very
narrow circumstances a normal exit shouldn't use _exit(1). Neither 1 no
_exit() (as opposed to exit()) seem appropriate. This seems like a bad
idea.
Also, won't this lead to postmaster now starting to complain about
pgstat having crashed in an immediate shutdown? Right now only exit
status 0 is expected:
if (pid == PgStatPID)
{
PgStatPID = 0;
if (!EXIT_STATUS_0(exitstatus))
LogChildExit(LOG, _("statistics collector process"),
pid, exitstatus);
if (pmState == PM_RUN || pmState == PM_HOT_STANDBY)
PgStatPID = pgstat_start();
continue;
}
> + * The statistics collector is started by the postmaster as soon as the
> + * startup subprocess finishes, or as soon as the postmaster is ready
> + * to accept read only connections during archive recovery. It remains
> + * alive until the postmaster commands it to terminate. Normal
> + * termination is by SIGUSR2 after the checkpointer must exit(0),
> + * which instructs the statistics collector to save the final statistics
> + * to reuse at next startup and then exit(0).
I can't parse "...after the checkpointer must exit(0), which instructs..."
> + * Emergency termination is by SIGQUIT; like any backend, the statistics
> + * collector will exit quickly without saving the final statistics. It's
> + * ok because the startup process will remove all statistics at next
> + * startup after emergency termination.
Normally there won't be any stats to remove, no? The permanent stats
file has been removed when the stats collector started up.
Greetings,
Andres Freund