Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks) - Mailing list pgsql-hackers

From	MauMau
Subject	Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks)
Date	June 20, 2013 22:40:04
Msg-id	72451F7353F64752996F752DE76E856F@maumau
In response to	Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks) (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses	Re: backend hangs at immediate shutdown (Re: Back-branch update releases coming in a couple weeks) (3 replies)
List	pgsql-hackers

From: "Alvaro Herrera" <alvherre@2ndquadrant.com>
> I will go with 5 seconds, then.

OK, I agree.


> My point is that there is no difference.  For one thing, once we enter
> immediate shutdown state, and SIGKILL has been sent, no further action
> is taken.  Postmaster will just sit there indefinitely until processes
> are gone.  If we were to make it repeat SIGKILL until they die, that
> would be different.  However, repeating SIGKILL is pointless, because if
> they didn't die when they first received it, they will still not die
> when they receive it a second time.  Also, if they're in uninterruptible sleep
> and don't die, then they will die as soon as they get out of that state;
> no further queries will get processed, no further memory access will be
> done.  So there's no harm in their remaining there until underlying
> storage returns to life, ISTM.
>
>> Here, "reliable" means that the database server is certainly shut
>> down when pg_ctl returns, not telling a lie that "I shut down the
>> server processes for you, so you do not have to be worried that some
>> postgres process might still remain and write to disk".  I suppose
>> reliable shutdown is crucial especially in HA cluster.  If pg_ctl
>> stop -mi gets stuck forever when there is an unkillable process (in
>> what situations does this happen? OS bug, or NFS hard mount?), I
>> think the DBA has to notice this situation from the unfinished
>> pg_ctl, investigate the cause, and take corrective action.
>
> So you're suggesting that keeping postmaster up is a useful sign that
> the shutdown is not going well?  I'm not really sure about this.  What
> do others think?

I think you are right, and there is no harm in leaving postgres processes in 
an unkillable state.  I'd like to leave the decision to you and/or others.
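
For reference, the sequence discussed above can be sketched roughly as follows.  This is only an illustration in Python, not the postmaster code: the function name immediate_shutdown() and its parameters are made up, and it assumes the 5 seconds is the grace period between SIGQUIT and a single SIGKILL.  After the SIGKILL it takes no further action and simply reports which children had not exited yet.

```python
import signal
import time


def immediate_shutdown(children, grace_seconds=5.0, poll_interval=0.05):
    """Hypothetical sketch: SIGQUIT all children, wait, then SIGKILL once.

    `children` is a list of subprocess.Popen objects.  Returns the children
    that were still running when the grace period expired (the ones that
    were sent SIGKILL).  SIGKILL is not repeated, matching the argument
    that a second SIGKILL cannot achieve what the first one did not.
    """
    for child in children:
        if child.poll() is None:
            child.send_signal(signal.SIGQUIT)

    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        if all(child.poll() is not None for child in children):
            return []
        time.sleep(poll_interval)

    survivors = [child for child in children if child.poll() is None]
    for child in survivors:
        child.kill()  # SIGKILL on POSIX, sent exactly once
    return survivors
```

A process in uninterruptible sleep would stay in the returned list and would only disappear once it leaves that state.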

One concern is that umount would fail in such a situation because postgres 
still has open files on the filesystem, which is on the shared disk in the 
case of a traditional HA cluster.  However, STONITH should resolve the problem 
by terminating the stuck node...  I just feel it is strange for umount to fail 
due to a remaining postgres process when pg_ctl stop -mi reported success.
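
To illustrate what I mean by checking before umount, here is a minimal sketch in Python of how a cluster script might verify that no known server process remains.  The helper names is_alive() and wait_until_gone() are hypothetical, and how the script obtains the list of PIDs is left out; this is not what pg_ctl does.

```python
import os
import time


def is_alive(pid):
    """Return True if a process with this PID exists (signal 0 probe)."""
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing, it only checks existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def wait_until_gone(pids, timeout_seconds, poll_interval=0.1):
    """Wait up to timeout_seconds; return the PIDs that still exist.

    An empty list means every listed process has gone.  Note that an
    unreaped zombie child still counts as existing for this probe.
    """
    deadline = time.monotonic() + timeout_seconds
    remaining = [pid for pid in pids if is_alive(pid)]
    while remaining and time.monotonic() < deadline:
        time.sleep(poll_interval)
        remaining = [pid for pid in remaining if is_alive(pid)]
    return remaining
```

If the returned list is non-empty, the script knows umount is likely to fail and can escalate (for example to STONITH) instead of trusting the reported success.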

> IIRC the only other interesting tweak I did was rename the
> SignalAllChildren() function to TerminateChildren().  I did this because
> it doesn't really signal all children; syslogger and dead_end backends
> are kept around.  So the original name was a bit misleading.  And we
> couldn't really name it SignalAlmostAllChildren(), could we ..

I see.  Thank you.

Regards
MauMau
