Re: Why has postmaster shutdown gotten so slow? - Mailing list pgsql-hackers

From Jan Wieck
Subject Re: Why has postmaster shutdown gotten so slow?
Msg-id 402418FF.7090408@Yahoo.com
In response to Re: Why has postmaster shutdown gotten so slow?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote:
> Jan Wieck <JanWieck@Yahoo.com> writes:
>> I checked the background writer for this and I can not reproduce the 
>> behaviour. If the bgwriter had zero blocks to write it does PG_USLEEP 
>> for 10 seconds, which on Unix is done by select() and that is correctly 
>> interrupted when the postmaster sends it the term signal on shutdown.
> 
> This appears to be a platform-dependent behavior.  The HPUX select(2) man
> page says
> 
>           [EINTR]        The select() function was interrupted before any
>                          of the selected events occurred and before the
>                          timeout interval expired. If SA_RESTART has been
>                          set for the interrupting signal, it is
>                          implementation-dependent whether select() restarts
>                          or returns with EINTR.
> 
> which text also appears verbatim in the Single Unix Spec.  Since we set
> SA_RESTART for every signal except SIGALRM (see pqsignal.c), we are
> subject to the implementation dependency for SIGTERM.

That explains it.
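
For anyone following along, the dependency comes down to which flags the handler is installed with. Below is a minimal sketch, not the actual pqsignal.c code: install_handler(), handler_restarts(), note_signal() and signal_seen are hypothetical names made up for illustration. It only shows that SA_RESTART is a per-signal choice made at sigaction() time and can be queried afterwards. Whether select() then restarts or returns EINTR is still up to the platform.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <string.h>

typedef void (*sig_handler) (int);

/* Hypothetical flag, set by the handler so callers can see the signal arrived. */
volatile sig_atomic_t signal_seen = 0;

void
note_signal(int signo)
{
    (void) signo;
    signal_seen = 1;
}

/*
 * Hypothetical helper: install func for signo, requesting SA_RESTART only
 * if restart is nonzero.  Returns 0 on success, -1 on failure, exactly as
 * sigaction() does.
 */
int
install_handler(int signo, sig_handler func, int restart)
{
    struct sigaction act;

    memset(&act, 0, sizeof(act));
    act.sa_handler = func;
    sigemptyset(&act.sa_mask);
    act.sa_flags = restart ? SA_RESTART : 0;
    return sigaction(signo, &act, NULL);
}

/*
 * Report whether the current disposition for signo has SA_RESTART set:
 * 1 if set, 0 if not, -1 if the query itself failed.
 */
int
handler_restarts(int signo)
{
    struct sigaction cur;

    if (sigaction(signo, NULL, &cur) != 0)
        return -1;
    return (cur.sa_flags & SA_RESTART) ? 1 : 0;
}
```

With restart = 0 the handler is installed without SA_RESTART, so an interrupted select() reports EINTR instead of falling under the implementation-dependent clause quoted above.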

> 
> Tracing the bgwriter process on my machine makes it real obvious that in
> fact the select delay is allowed to finish out when SIGTERM is received.
> In fact worse than that: it's restarted from the beginning.  If 5
> seconds have already elapsed, another 10 still elapse before the select
> exits.
> 
> This won't do :-(.  We cannot afford to fritter away 10 seconds in the
> SIGTERM shutdown cycle --- on typical systems init isn't going to give
> us more than 20 seconds before a hard kill.
> 
> I'd suggest reducing the delay to a second or two, or perhaps breaking
> it into several 1-second waits with interrupt flag checks between.
> 
> In the longer run we might want to rethink what we are doing with
> SA_RESTART, but I am not sure about the implications of fooling with
> that.

I think at this point we should define some maximum value for PG_xSLEEP.
Above that value it would fall back to a function call that either breaks
the wait up into a loop that checks InterruptPending between iterations, or
removes the SA_RESTART flag while waiting for the timeout.
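
A rough sketch of the loop variant, to make the idea concrete. This is not a patch: interruptible_sleep(), interrupt_pending and handle_sigterm() are hypothetical stand-ins (the flag plays the role of InterruptPending), and the slice length is left to the caller.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical stand-in for InterruptPending; set by the signal handler. */
volatile sig_atomic_t interrupt_pending = 0;

void
handle_sigterm(int signo)
{
    (void) signo;
    interrupt_pending = 1;
}

/*
 * Hypothetical helper: sleep for up to total_usec microseconds, in slices
 * of at most slice_usec, checking the flag before each slice.
 *
 * Returns 1 if the flag cut the sleep short, 0 if the sleep ran to the end.
 * If a slice returns early (for example with EINTR) it still counts as a
 * full slice, so the total sleep can be shorter than requested but is
 * never extended.
 */
int
interruptible_sleep(long total_usec, long slice_usec)
{
    long        remaining = total_usec;

    while (remaining > 0)
    {
        long        chunk = (remaining < slice_usec) ? remaining : slice_usec;
        struct timeval tv;

        if (interrupt_pending)
            return 1;

        tv.tv_sec = chunk / 1000000L;
        tv.tv_usec = chunk % 1000000L;

        /* No file descriptors: select() is used purely as a timer. */
        (void) select(0, NULL, NULL, NULL, &tv);
        remaining -= chunk;
    }
    return interrupt_pending ? 1 : 0;
}
```

With 1-second slices, a bgwriter-style caller would use interruptible_sleep(10000000L, 1000000L). Even on a platform that restarts select() from the beginning after SIGTERM, the worst case is then roughly two slices before the flag is noticed, not another full 10 seconds.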


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


