Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks) - Mailing list pgsql-hackers

From Andres Freund
Subject Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)
Msg-id 20130201140430.GC6915@awork2.anarazel.de
In response to Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)  (Peter Eisentraut <peter_e@gmx.net>)
Responses Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 2013-02-01 08:55:24 -0500, Peter Eisentraut wrote:
> On 1/31/13 5:42 PM, MauMau wrote:
> > Thank you for sharing your experience.  So you also considered making
> > postmaster SIGKILL children like me, didn't you?  I bet most of people
> > who encounter this problem would feel like that.
> > 
> > It is definitely pg_ctl who needs to be prepared, not the users.  It may
> > not be easy to find out postgres processes to SIGKILL if multiple
> > instances are running on the same host.  Just doing "pkill postgres"
> > will unexpectedly terminate postgres of other instances.
> 
> In my case, it was one backend process segfaulting, and then some other
> backend processes didn't respond to the subsequent SIGQUIT sent out by
> the postmaster.  So pg_ctl didn't have any part in it.
> 
> We ended up addressing that by installing a nagios event handler that
> checked for this situation and cleaned it up.
> 
> > I would like to make a patch that changes SIGQUIT to SIGKILL when
> > postmaster terminates children.  Any other better ideas?
> 
> That was my idea back then, but there were some concerns about it.
> 
> I found an old patch that I had prepared for this, which I have
> attached.  YMMV.

> +static void
> +quickdie_alarm_handler(SIGNAL_ARGS)
> +{
> +    /*
> +     * We got here if ereport() was blocking, so don't go there again
> +     * except when really asked for.
> +     */
> +    elog(DEBUG5, "quickdie aborted by alarm");
> +

It's probably not wise to enter elog.c again: that path might allocate
memory, and we wouldn't be any wiser. Unfortunately, there's not much
besides a write(2) to stderr that can safely be done...

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services