Re: Latch implementation that wakes on postmaster death on both win32 and Unix - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Latch implementation that wakes on postmaster death on both win32 and Unix
Date
Msg-id CAEYLb_UEOQr43P3VMv9nJ3ZEZNmQJ5NEcjg7PtaExcQLzr+jFg@mail.gmail.com
In response to Re: Latch implementation that wakes on postmaster death on both win32 and Unix  (Florian Pflug <fgp@phlo.org>)
List pgsql-hackers
On 4 July 2011 22:42, Florian Pflug <fgp@phlo.org> wrote:
> If we do expect such an event, we should close the hole instead of asserting.
> If we don't, then what's the point of the assert?

You can say the same thing about any assertion. I'm not going to
attempt to close the hole because I don't believe that there is one. I
would be happy to see your "read() from the pipe after select()" test
asserted though.
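
To make that concrete, here is a minimal sketch of what such a check
might look like. It assumes the read end of the pipe is non-blocking
and that nothing is ever written to the pipe, so read() can only
report EOF (all write ends closed) or EAGAIN (writer still alive).
The names make_nonblocking(), pipe_writer_is_gone() and
assert_death_wakeup() are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical helper: put fd into non-blocking mode; returns 0 on success. */
int
make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * Hypothetical helper: returns true only if every write end of the pipe
 * has been closed, i.e. read() reports EOF. Assumes read_fd is
 * non-blocking, so a live writer yields EAGAIN rather than a hang.
 */
bool
pipe_writer_is_gone(int read_fd)
{
    char    c;
    ssize_t rc;

    /* Retry if a signal interrupts the read. */
    do
    {
        rc = read(read_fd, &c, 1);
    } while (rc < 0 && errno == EINTR);

    /*
     * rc == 0: EOF, no writer remains.
     * rc < 0:  EAGAIN/EWOULDBLOCK (writer alive) or some other error.
     * rc > 0:  unexpected data, so a writer evidently exists.
     */
    return rc == 0;
}

/* To be called right after select() reports read_fd as readable. */
void
assert_death_wakeup(int read_fd)
{
    assert(pipe_writer_is_gone(read_fd));
}
```

Whether the real code should assert here or handle the failure some
other way is exactly the open question; the sketch only shows what
the test itself would check.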

> BTW, do we currently retry the select() on EINTR (meaning a signal has
> arrived)? If we don't, that'd be an additional source of spurious returns
> from select.

Why might it be? WaitLatch() is currently documented to potentially
have its timeout invalidated by the process receiving a signal, which
is the exact opposite problem. We do account for this within the
archiver calling code though, and I remark upon it in a comment there.
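
For illustration only, one way a caller can keep a signal from
stretching its overall wait is to measure elapsed time against a
fixed starting point and shrink the timeout on every retry. This is
a generic sketch under that assumption, not what WaitLatch() or the
archiver actually does; wait_readable_with_deadline() and
elapsed_ms() are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

/* Milliseconds elapsed since *start, measured on the monotonic clock. */
static long
elapsed_ms(const struct timespec *start)
{
    struct timespec now;

    clock_gettime(CLOCK_MONOTONIC, &now);
    return (now.tv_sec - start->tv_sec) * 1000L +
           (now.tv_nsec - start->tv_nsec) / 1000000L;
}

/*
 * Hypothetical helper: wait until fd is readable or timeout_ms has
 * passed. Returns 1 if readable, 0 on timeout, -1 on error.
 * An EINTR restarts select() with only the time that is left, so
 * signals neither cut the wait short nor extend it.
 */
int
wait_readable_with_deadline(int fd, long timeout_ms)
{
    struct timespec start;

    assert(timeout_ms > 0);
    clock_gettime(CLOCK_MONOTONIC, &start);

    for (;;)
    {
        long           remaining = timeout_ms - elapsed_ms(&start);
        struct timeval tv;
        fd_set         rfds;
        int            rc;

        if (remaining <= 0)
            return 0;           /* deadline already passed */

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = remaining / 1000;
        tv.tv_usec = (remaining % 1000) * 1000;

        rc = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (rc > 0)
            return 1;           /* fd is readable */
        if (rc == 0)
            return 0;           /* timed out */
        if (errno != EINTR)
            return -1;          /* real error */
        /* EINTR: loop and recompute the remaining time */
    }
}
```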

> I'm not sure that there is currently a guarantee that PostmasterIsAlive
> will return false immediately after select() indicates postmaster
> death. If e.g. the postmaster's parent is still running (which happens
> for example if you launch postgres via daemontools), the re-parenting of
> backends to init might not happen until the postmaster zombie has been
> vanquished by its parent's call of waitpid(). It's not entirely
> inconceivable for getppid() to then return the (dead) postmaster's pid
> until that waitpid() call has occurred.

Yes, this did occur to me - it's hard to reason about what exactly
happens here, and probably impossible to have the behaviour guaranteed
across platforms, however unlikely it seems. I'd like to hear what
Heikki has to say about asserting or otherwise verifying postmaster
death when a wake-up indicates that the postmaster has apparently died.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

