Re: kill -KILL: What happens? - Mailing list pgsql-hackers

From Florian Pflug
Subject Re: kill -KILL: What happens?
Date
Msg-id 7A2978E1-ADC5-46BF-96CC-24AF808507D6@phlo.org
In response to Re: kill -KILL: What happens?  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: kill -KILL: What happens?  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Jan14, 2011, at 17:45 , Robert Haas wrote:
> On Fri, Jan 14, 2011 at 11:28 AM, Florian Pflug <fgp@phlo.org> wrote:
>> I gather that the behaviour we want is for normal backends to exit
>> once the postmaster is gone, and for utility processes (bgwriter, ...)
>> to exit once all the backends are gone.
>>
>> The test program I posted in this thread proves that FIFOs and select()
>> can be used to implement this, if we're ready to check for EOF on the
>> socket in CHECK_FOR_INTERRUPTS() every few seconds. Is this a viable
>> route to take?
>
> I don't think there's much point in getting excited about the order in
> which things exit.  If we're agreed (and we seem to be, modulo Tom)
> that the backends should exit quickly if the postmaster dies, then
> worrying about whether the utility processes exit slightly before or
> slightly after that doesn't excite me very much.

I've realized that POSIX actually *does* provide a way to receive a signal -
the SIGIO machinery. I've modified my test case to do that. To simplify things,
I've removed support for multiple life sign objects.

The code now does the following:

The parent creates a pipe, sets its reading fd to O_NONBLOCK and O_ASYNC,
and registers a SIGIO handler. The SIGIO handler checks a global flag, and
simply sends a SIGTERM to its own pid if the flag is set.
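
A minimal sketch of the setup described above, for a single process. This is not the attached test program: the names exit_when_parent_dies, install_sigio_handler() and life_sign_arm() are made up for illustration, and the F_SETOWN call is an assumption about how the signal gets routed, since the message does not say. Note that the owner set with F_SETOWN belongs to the open file description, so several children sharing one inherited fd would also share that setting; a real implementation would have to deal with that (for example with a process group owner or one open per child).

```c
#define _GNU_SOURCE             /* exposes O_ASYNC on glibc */
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Set by a process that wants a SIGTERM once the parent is gone
 * (hypothetical name). */
static volatile sig_atomic_t exit_when_parent_dies = 0;

/* SIGIO handler: turn the notification into a SIGTERM for ourselves. */
static void
sigio_handler(int signo)
{
    (void) signo;
    if (exit_when_parent_dies)
        kill(getpid(), SIGTERM);    /* kill() and getpid() are async-signal-safe */
}

/* Install the SIGIO handler; returns 0 on success, -1 on error. */
static int
install_sigio_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = sigio_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGIO, &sa, NULL);
}

/* Make the pipe's read end non-blocking and ask for SIGIO to be sent to
 * the calling process. Returns 0 on success, -1 on error. */
static int
life_sign_arm(int read_fd)
{
    int flags = fcntl(read_fd, F_GETFL);

    if (flags < 0)
        return -1;
    if (fcntl(read_fd, F_SETOWN, getpid()) < 0)
        return -1;
    return fcntl(read_fd, F_SETFL, flags | O_NONBLOCK | O_ASYNC) < 0 ? -1 : 0;
}
```

A process would call install_sigio_handler() and life_sign_arm() on the read end, then set exit_when_parent_dies once it wants to be terminated along with the parent.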

Child processes close the pipe's writing end (called "giving up ownership
of the life sign" in the code) and set the global flag if they want to receive
a SIGTERM once the parent is gone. The parent's health state can additionally
be checked at any time by trying to read() from the pipe. read() fails with
EAGAIN as long as the parent is still alive and returns EOF otherwise.
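
The read()-based check can be sketched roughly as follows. Again, this is an illustration and not the attached code: the enum and the function names life_sign_set_nonblocking() and check_parent() are hypothetical. It assumes nobody ever writes to the pipe, so any data showing up is treated as an error.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical result codes for the liveness check. */
enum parent_state { PARENT_ALIVE, PARENT_GONE, PARENT_CHECK_ERROR };

/* Put the pipe's read end into non-blocking mode; 0 on success, -1 on error. */
static int
life_sign_set_nonblocking(int read_fd)
{
    int flags = fcntl(read_fd, F_GETFL);

    if (flags < 0)
        return -1;
    return fcntl(read_fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Poll the life sign once. The caller must already have closed its own
 * copy of the write end, otherwise EOF can never be seen. */
static enum parent_state
check_parent(int read_fd)
{
    char    c;
    ssize_t n;

    do {
        n = read(read_fd, &c, 1);
    } while (n < 0 && errno == EINTR);      /* retry if interrupted by a signal */

    if (n == 0)
        return PARENT_GONE;                 /* EOF: every write end is closed */
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return PARENT_ALIVE;                /* a writer still holds the pipe open */
    return PARENT_CHECK_ERROR;              /* unexpected data or another error */
}
```

Because the check is a single non-blocking read(), it is cheap enough to call from a periodic interrupt check.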

I'm not sure how portable this is. It compiles and runs fine on both my Linux
machine (Ubuntu 10.04.01 LTS) and my laptop (OS X 10.6.6).

In the EXEC_BACKEND case the pipe would need to be created with mkfifo() in
the data directory, but otherwise things should work the same. Haven't tried
that yet, though.

Code attached. The output should be

Launched backend 8636
Launched backend 8637
Launched backend 8638
Backend 8636 detected live parent
Backend 8637 detected live parent
Backend 8638 detected live parent
Backend 8636 detected live parent
Backend 8637 detected live parent
Backend 8638 detected live parent
Parent exiting
Backend 8637 exiting after parent died
Backend 8638 exiting after parent died
Backend 8636 exiting after parent died

if things work correctly.

best regards,
Florian Pflug

