Re: [RFC] Should we fix postmaster to avoid slow shutdown? - Mailing list pgsql-hackers

From Tsunakawa, Takayuki
Subject Re: [RFC] Should we fix postmaster to avoid slow shutdown?
Msg-id 0A3221C70F24FB45833433255569204D1F653D1B@G01JPEXMBYT05
In response to Re: [RFC] Should we fix postmaster to avoid slow shutdown?  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [RFC] Should we fix postmaster to avoid slow shutdown?  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
From: Robert Haas [mailto:robertmhaas@gmail.com]
> On Fri, Nov 18, 2016 at 4:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> >> Tom Lane wrote:
> >>> IMO it's not, and closer analysis says that this patch series is an
> >>> attempt to solve something we already fixed, better, in 9.4.
> >
> >> ... by the same patch submitter.
> >
> > [ confused ]  The commit log credits 82233ce7e to MauMau and yourself.
> 
> IIUC, MauMau = Tsunakawa Takayuki.

Yes, it's me.  I'm pleased that you remember me!

First, I understand that zapping the stats file during recovery can be a problem.  In fact, I was the one who proposed adding
a sentence to the manual saying that the stats file is reset after an immediate shutdown.  I think addressing that problem is
a separate topic for a new thread.
 

The reasons why I proposed this patch are:

* It happened in a highly mission-critical production system of a customer who uses 9.2.

* 9.4's solution is not perfect, because it wastes 5 seconds anyway, which is unexpected for users.  The customer's
requirement includes failover within 30 seconds, so 5 seconds can be seen as a risk.
 
Plus, I'm worried about the possibility that the SIGKILLed process wouldn't disappear if it's writing to a network
storage like NFS.
 

* And above all, the immediate shutdown should shut the server down immediately without doing anything heavy, as the
name implies.
 

So, I think this patch should also be applied to later releases.  The purpose of the patch in 9.4 was to work around a
PostgreSQL bug in which the ereport() in quickdie() gets stuck waiting for malloc()'s lock to be released.
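
To make the 5-second cost concrete, here is a minimal sketch of the escalation pattern as I understand it: send SIGQUIT, wait for a grace period, and fall back to SIGKILL only if the child is still alive.  This is not the postmaster code (which is C).  The names spawn_child and stop_child, the Python child process, and the configurable grace period are all hypothetical stand-ins for illustration, and the sketch assumes a POSIX system.

```python
import signal
import subprocess
import sys
import time


def spawn_child(stuck):
    """Start a hypothetical child; if stuck, it ignores SIGQUIT entirely."""
    handler = "signal.SIG_IGN" if stuck else "lambda *_: sys.exit(0)"
    code = (
        "import signal, sys, time\n"
        f"signal.signal(signal.SIGQUIT, {handler})\n"
        "print('ready', flush=True)\n"
        "time.sleep(60)\n"
    )
    proc = subprocess.Popen([sys.executable, "-c", code], stdout=subprocess.PIPE)
    proc.stdout.readline()  # wait until the child has installed its handler
    return proc


def stop_child(proc, grace_seconds):
    """Send SIGQUIT, wait up to grace_seconds, then fall back to SIGKILL.

    Returns (child exit status, seconds spent waiting).
    """
    start = time.monotonic()
    proc.send_signal(signal.SIGQUIT)
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX; only reached if the child is stuck
        proc.wait()
    proc.stdout.close()
    return proc.returncode, time.monotonic() - start
```

Calling stop_child(spawn_child(stuck=True), 5.0) spends the whole grace period before SIGKILL is sent, while a responsive child returns almost at once.  The stuck case is the one that eats into a 30-second failover budget.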
 

Regards
Takayuki Tsunakawa

