Re: Better client reporting for "immediate stop" shutdowns - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: Better client reporting for "immediate stop" shutdowns
Date
Msg-id CABUevEx1Obz9cR7O+RMoHe8QFcG_iC2znof3hJ54oKN4Qvn2Kw@mail.gmail.com
Whole thread Raw
In response to Re: Better client reporting for "immediate stop" shutdowns  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: Better client reporting for "immediate stop" shutdowns
List pgsql-hackers
On Tue, Dec 22, 2020 at 2:29 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Tue, Dec 22, 2020 at 3:13 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Up to now, if you shut down the database with "pg_ctl stop -m immediate"
> > then clients get a scary message claiming that something has crashed,
> > because backends can't tell whether the SIGQUIT they got was sent for
> > a crash-and-restart situation or because of an immediate-stop command.
> >
> > This isn't great from a fit-and-finish perspective, and it occurs to me
> > that it's really easy to do better: the postmaster can stick a flag
> > into shared memory explaining the reason for SIGQUIT.  While we don't
> > like the postmaster touching shared memory, there doesn't seem to be
> > any need for interlocking or anything like that, so there is no risk
> > involved that's greater than those already taken by pmsignal.c.
>
> +1 to improve the message.
>
> > So, here's a very simple proposed patch.  Some issues for possible
> > bikeshedding:
> >
> > * Up to now, pmsignal.c has only been for child-to-postmaster
> > communication, so maybe there is some better place to put the
> > support code.  I can't think of one though.
>
> +1 to have it here as we already have the required shared memory
> initialization code to add in new flags there -
> PMSignalState->sigquit_reason.
>
> If I'm correct, quickdie() doesn't access any shared memory because
> one of the reason we can be in quickdie() is when the shared memory
> itself is corrupted(the comment down below on why we don't call
> roc_exit() or atexit() says), in such case, will GetQuitSignalReason()
> have some problem in accessing the shared memory i.e. +    return
> PMSignalState->sigquit_reason;?
>
> > * I chose to report the same message for immediate shutdown as we
> > already use for SIGTERM (fast shutdown or pg_terminate_backend()).
> > Should it be different, and if so what?
>
> AFAIK, errmsg(terminating connection due to administrator command") is
> emitted when there's no specific reason. But we know exactly why we
> are terminating in this case, how about having an error message like
> errmsg("terminating connection due to immediate shutdown request")));
> ? There are other places where errmsg("terminating connection due to
> XXXX reasons"); is used. We also log messages when an immediate
> shutdown request is received errmsg("received immediate shutdown
> request").

+1. I definitely think having this message be different can be useful.

See also the thread about tracking shutdown reasons (connection
statistics) -- not the same thing, but the same concepts apply.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/



pgsql-hackers by date:

Previous
From: Bharath Rupireddy
Date:
Subject: Re: Parallel Inserts in CREATE TABLE AS
Next
From: Li Japin
Date:
Subject: Confused about stream replication protocol documentation