Re: Something is fairly whacko about shutdown in CVS HEAD - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Something is fairly whacko about shutdown in CVS HEAD
Msg-id 2308.1183257734@sss.pgh.pa.us
In response to Re: Something is fairly whacko about shutdown in CVS HEAD  (Alvaro Herrera <alvherre@commandprompt.com>)
Responses Re: Something is fairly whacko about shutdown in CVS HEAD  (Alvaro Herrera <alvherre@commandprompt.com>)
List pgsql-hackers
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> I'm seeing two sets of shutdown messages, and apparently a second
>> shutdown checkpoint being forced, during a normal database stop:

> Huh, I can't reproduce it here.

It looks to me like this is a race condition induced by the
autovacuum-launcher patches.  Observe the following chunk of
postmaster.c, which responds to exit of the bgwriter child:
        /*
         * Was it the bgwriter?
         */
        if (BgWriterPID != 0 && pid == BgWriterPID)
        {
            BgWriterPID = 0;
            if (EXIT_STATUS_0(exitstatus) &&
                Shutdown > NoShutdown && !FatalError &&
                !DLGetHead(BackendList) && AutoVacPID == 0)
            {
                /*
                 * Normal postmaster exit is here: we've seen normal exit of
                 * the bgwriter after it's been told to shut down. We expect
                 * that it wrote a shutdown checkpoint.  (If for some reason
                 * it didn't, recovery will occur on next postmaster start.)
                 *
                 * Note: we do not wait around for exit of the archiver or
                 * stats processes.  They've been sent SIGQUIT by this point,
                 * and in any case contain logic to commit hara-kiri if they
                 * notice the postmaster is gone.
                 */
                ExitPostmaster(0);
            }

            /*
             * Any unexpected exit of the bgwriter (including FATAL exit)
             * is treated as a crash.
             */
            HandleChildCrash(pid, exitstatus,
                             _("background writer process"));

If AutoVacPID is still nonzero when bgwriter exit is detected,
then we think we've seen a crash.  I'm not clear why it happens
reliably for me and not for you, but this is certainly a bug.
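
To make the failure mode concrete, here is a minimal standalone sketch of
that test.  It is not the postmaster code: the PmState struct, the enum, and
classify_bgwriter_exit() are hypothetical names invented for illustration,
and each field only models the like-named condition in the excerpt above.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the state the quoted test consults. */
typedef struct
{
    bool shutdown_requested;  /* models Shutdown > NoShutdown */
    bool fatal_error;         /* models FatalError */
    bool backends_remaining;  /* models DLGetHead(BackendList) != NULL */
    int  autovac_pid;         /* models AutoVacPID; 0 = launcher already reaped */
} PmState;

typedef enum
{
    PM_EXIT_CLEAN,            /* corresponds to ExitPostmaster(0) */
    PM_TREAT_AS_CRASH         /* corresponds to HandleChildCrash(...) */
} BgwriterExitAction;

/*
 * Same shape as the quoted condition: every clause must hold for a clean
 * exit, and anything else falls through to crash handling.
 */
BgwriterExitAction
classify_bgwriter_exit(const PmState *st, bool exit_status_zero)
{
    if (exit_status_zero &&
        st->shutdown_requested && !st->fatal_error &&
        !st->backends_remaining && st->autovac_pid == 0)
        return PM_EXIT_CLEAN;

    return PM_TREAT_AS_CRASH;
}
```

With shutdown requested, no backends left, and a zero exit status, the only
thing separating the two outcomes is whether the launcher's exit has already
been reaped, which is purely a matter of timing.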

To resolve this I think we need a clearer definition of the autovac
launcher's role in life.  I see that it is attached to shared memory;
is it supposed to be able to execute transactions or otherwise do
anything the bgwriter might have to clean up after?  If so we need
to fix things so that we don't tell the bgwriter to exit until after
the launcher is gone.  If not, we could possibly allow these things
to happen asynchronously, though I wonder whether it wouldn't be best
to force the ordering anyway.
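
For illustration only, forcing the ordering could look something like the
sketch below.  This is an assumption about one possible shape, not a
proposed patch: ShutdownStep and next_shutdown_step() are hypothetical
names, and the real postmaster logic has more cases than this.

```c
#include <stdbool.h>

/* Hypothetical steps for an ordered shutdown. */
typedef enum
{
    STEP_WAIT_FOR_BACKENDS,   /* regular backends still running */
    STEP_WAIT_FOR_LAUNCHER,   /* autovac launcher not yet reaped */
    STEP_SIGNAL_BGWRITER,     /* safe to ask for the shutdown checkpoint */
    STEP_EXIT_POSTMASTER      /* bgwriter is gone too */
} ShutdownStep;

/*
 * Decide what to do next during shutdown.  The bgwriter is told to exit
 * only once the launcher's pid has been cleared, so a clean bgwriter exit
 * can never be observed while the launcher pid is still nonzero.
 */
ShutdownStep
next_shutdown_step(bool backends_remaining, int autovac_pid, int bgwriter_pid)
{
    if (backends_remaining)
        return STEP_WAIT_FOR_BACKENDS;
    if (autovac_pid != 0)
        return STEP_WAIT_FOR_LAUNCHER;
    if (bgwriter_pid != 0)
        return STEP_SIGNAL_BGWRITER;
    return STEP_EXIT_POSTMASTER;
}
```

The point of the strict ordering is that the launcher-pid check in the
bgwriter-exit test becomes an invariant rather than a timing-dependent
condition.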
        regards, tom lane

