Thread: Unix latch implementation that wakes on postmaster death

From: Peter Geoghegan
Attached is a patch that builds upon Florian Pflug's earlier proof of
concept program for monitoring the postmaster. The code creates a
non-blocking pipe in the postmaster that child processes block on
using a select() call. This all occurs in the latch code, which now
monitors postmaster death, but only for clients that request it (and,
almost invariably, in addition to monitoring other things, like having
a timeout occur or a latch set).
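
To make the pipe technique concrete, here is a minimal sketch of the general idea, not code from the attached patch. The function name parent_died_within and its millisecond timeout are assumptions made for illustration. It relies only on standard pipe semantics: nothing is ever written to the pipe, so the read end can only become readable at EOF, which happens once every copy of the write end has been closed.

```c
#include <stdbool.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): wait up to timeout_ms for the
 * read end of a "death watch" pipe to become readable.
 *
 * Nobody writes to the pipe, so readability can only mean EOF, i.e. the
 * process holding the write end has exited or closed it.
 *
 * Returns true if EOF was seen, false on timeout. For brevity, a select()
 * error (including EINTR) is also reported as false.
 */
bool
parent_died_within(int read_fd, long timeout_ms)
{
    fd_set          readfds;
    struct timeval  tv;

    FD_ZERO(&readfds);
    FD_SET(read_fd, &readfds);

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    return select(read_fd + 1, &readfds, NULL, NULL, &tv) > 0;
}
```

A forked child inherits both ends of the pipe, so in this sketch it would have to close its own copy of the write end before waiting; otherwise that copy keeps the pipe open and EOF never arrives.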

I've implemented an interface originally sketched by Heikki that
allows clients to specify events to wake on, and to see what event
actually caused the wakeup when we're done by bitwise AND'ing the
returned int against various new bitmasks.
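
As an illustration of the bitmask style described above, the following is a small sketch under stated assumptions. The flag names, their values, and both functions are hypothetical and are not taken from the patch; report_wakeup is only a toy stand-in for a real wait call.

```c
/* Hypothetical event bits; the patch's actual names and values may differ. */
#define WAKE_ON_LATCH_SET         (1 << 0)
#define WAKE_ON_TIMEOUT           (1 << 1)
#define WAKE_ON_POSTMASTER_DEATH  (1 << 2)

/*
 * Toy stand-in for a wait call: given the events the caller asked to be
 * woken for and the events that actually occurred, report only the
 * requested events that occurred.
 */
int
report_wakeup(int wake_events, int occurred)
{
    return wake_events & occurred;
}

/*
 * Caller side: test the returned int against each bitmask to learn what
 * caused the wakeup. Postmaster death is checked first because it is
 * usually the most urgent condition.
 */
const char *
describe_wakeup(int result)
{
    if (result & WAKE_ON_POSTMASTER_DEATH)
        return "postmaster died";
    if (result & WAKE_ON_LATCH_SET)
        return "latch set";
    if (result & WAKE_ON_TIMEOUT)
        return "timeout";
    return "no requested event occurred";
}
```

A caller that does not include the postmaster-death bit in its request never sees that bit in the result, which matches the "only for clients that request it" behaviour described earlier.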

I've included my existing changes to the archiver as a convenience to
anyone that wants to quickly see the effects of the patch in action;
even though we don't have a tight loop that polls PostmasterIsAlive()
every second, we still wake up on postmaster death, so there is no
potential denial of service as previously described by Tom. This can
be easily observed by sending the postmaster SIGKILL while the
archiver is running; the archiver finishes immediately. Note that I've
deferred changing the existing call sites of WaitLatch()/
WaitLatchOrSocket(), except to make them use the new interface. Just
as before, they don't ask to be woken on postmaster death, even though
in some cases they probably should. Whether or not they should and how
they should are questions for another day though.

I expect that this patch will be split into two separate patches: the
latch patch (complete with currently missing win32 implementation) and
the archiver patch. For now, I'd like to hear thoughts on how I've
implemented the extra latch functionality.

How should I be handling the EXEC_BACKEND case?

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

Re: Unix latch implementation that wakes on postmaster death

From: Robert Haas
On Fri, May 13, 2011 at 8:06 AM, Peter Geoghegan <peter@2ndquadrant.com> wrote:
> Attached is a patch that builds upon Florian Pflug's earlier proof of
> concept program for monitoring the postmaster. The code creates a
> non-blocking pipe in the postmaster that child processes block on
> using a select() call. This all occurs in the latch code, which now
> monitors postmaster death, but only for clients that request it (and,
> almost invariably, in addition to monitoring other things, like having
> a timeout occur or a latch set).
>
> I've implemented an interface originally sketched by Heikki that
> allows clients to specify events to wake on, and to see what event
> actually caused the wakeup when we're done by bitwise AND'ing the
> returned int against various new bitmasks.
>
> I've included my existing changes to the archiver as a convenience to
> anyone that wants to quickly see the effects of the patch in action;
> even though we don't have a tight loop that polls PostmasterIsAlive()
> every second, we still wake up on postmaster death, so there is no
> potential denial of service as previously described by Tom. This can
> be easily observed by sending the postmaster SIGKILL while the
> archiver is running; the archiver finishes immediately. Note that I've
> deferred changing the existing call sites of WaitLatch()/
> WaitLatchOrSocket(), except to make them use the new interface. Just
> as before, they don't ask to be woken on postmaster death, even though
> in some cases they probably should. Whether or not they should and how
> they should are questions for another day though.

I don't immediately have time to look at this, but it sounds awesome!
Thank you very much for working on this!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Unix latch implementation that wakes on postmaster death

From: Tom Lane
Peter Geoghegan <peter@2ndquadrant.com> writes:
> Attached is a patch that builds upon Florian Pflug's earlier proof of
> concept program for monitoring the postmaster.

Cool.  Like Robert, no time to review this in detail now, but ...

> How should I be handling the EXEC_BACKEND case?

Assuming that the open pipe descriptor is inherited across exec on
Windows (and if it's not, we're back to square one) all you should
have to do is get the pipe descriptor variables passed down to the
child processes.  There's some grotty code in postmaster.c that's
used for this purpose --- see struct BackendParameters and associated
functions.  Just add some code there to pass down the values.
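
The save-and-restore pattern described here can be sketched roughly as follows. This is a toy illustration only: ToyBackendParameters, death_watch_pipe, and both functions are hypothetical names, and the real struct BackendParameters in postmaster.c carries many more fields and is handed to the child by its own mechanism.

```c
#include <string.h>

/* Hypothetical global holding both ends of the postmaster-death pipe. */
int death_watch_pipe[2] = {-1, -1};

/* Toy stand-in for the parameter block handed to an exec'ed child. */
typedef struct ToyBackendParameters
{
    int death_watch_pipe[2];
} ToyBackendParameters;

/* Parent side: copy the descriptor values into the parameter block. */
void
save_toy_parameters(ToyBackendParameters *param)
{
    memcpy(param->death_watch_pipe, death_watch_pipe,
           sizeof(death_watch_pipe));
}

/* Child side, after exec: copy the values back into the global. */
void
restore_toy_parameters(const ToyBackendParameters *param)
{
    memcpy(death_watch_pipe, param->death_watch_pipe,
           sizeof(death_watch_pipe));
}
```

Only the descriptor numbers travel in the block. They are useful to the child only if the descriptors themselves stay open across exec, which is the inheritance assumption stated above.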

I'm not that thrilled with the "life sign" terminology, but don't
have a better idea right offhand.
        regards, tom lane


Re: Unix latch implementation that wakes on postmaster death

From: Robert Haas
On Fri, May 13, 2011 at 10:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'm not that thrilled with the "life sign" terminology, but don't
> have a better idea right offhand.

Yeah, that made no sense to me.  Can't we just refer to detecting
postmaster death?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Unix latch implementation that wakes on postmaster death

From: Peter Geoghegan
On 13 May 2011 16:18, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, May 13, 2011 at 10:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm not that thrilled with the "life sign" terminology, but don't
>> have a better idea right offhand.
>
> Yeah, that made no sense to me.  Can't we just refer to detecting
> postmaster death?

Fine by me.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services