Latch implementation that wakes on postmaster death on both win32 and Unix - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Latch implementation that wakes on postmaster death on both win32 and Unix
Msg-id BANLkTinMfW4FPG_Cbbc6J9CiJAvCWCHMXw@mail.gmail.com
Responses Re: Latch implementation that wakes on postmaster death on both win32 and Unix  (Peter Geoghegan <peter@2ndquadrant.com>)
Re: Latch implementation that wakes on postmaster death on both win32 and Unix  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
Attached is the latest revision of the latch implementation that
monitors postmaster death, plus the archiver client that now relies on
that new functionality and thereby works well without a tight
PostmasterIsAlive() polling loop.

On second thought, it is reasonable for the patch to be evaluated with
the archiver changes. Any problems that we'll have with latch changes
are likely problems that all WL_POSTMASTER_DEATH latch clients will
have, so we might as well include the simplest such client initially.
Once I have buy-in on the latch changes, the archiver work becomes
uncontroversial, I think.

The lifesign terminology has been dropped. We now close() the file
descriptor that represents "ownership" - the write end of our
anonymous pipe - in each child backend directly in the forking
machinery (the thin fork() wrapper for the non-EXEC_BACKEND case),
through a call to ReleasePostmasterDeathWatchHandle(). We don't have
to do that on Windows, and we don't.
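
To make the pipe mechanism above concrete, here is a minimal standalone
sketch for Unix. It is not the code from the patch. The function names
(wait_for_writer_exit, demo_death_watch) are made up for illustration,
and the roles are inverted for brevity: the forked process plays the
"postmaster" that holds the write end, while the calling process plays
the backend that watches the read end with select(). The flag shows why
the watcher has to close its own copy of the write end, which is the
job ReleasePostmasterDeathWatchHandle() does in the patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Block until read_fd reports EOF or timeout_sec elapses.
 * Returns 1 on EOF (every holder of the write end is gone),
 * 0 on timeout or error.
 */
static int
wait_for_writer_exit(int read_fd, int timeout_sec)
{
    for (;;)
    {
        fd_set          rfds;
        struct timeval  tv;
        char            c;
        int             rc;

        FD_ZERO(&rfds);
        FD_SET(read_fd, &rfds);
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;

        rc = select(read_fd + 1, &rfds, NULL, NULL, &tv);
        if (rc < 0 && errno == EINTR)
            continue;           /* interrupted by a signal: wait again */
        if (rc <= 0)
            return 0;           /* timeout or error */

        /* Nothing is ever written, so "readable" can only mean EOF. */
        return read(read_fd, &c, 1) == 0;
    }
}

/*
 * Hypothetical demo. The forked process stands in for the postmaster:
 * it keeps the write end open and exits after 200 ms. The caller stands
 * in for a backend. If release_write_end is 0, the caller keeps its own
 * copy of the write end open, so EOF never arrives and the wait times out.
 */
int
demo_death_watch(int release_write_end)
{
    int     fds[2];
    pid_t   pid;
    int     died;

    if (pipe(fds) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0)
    {
        struct timespec ts = {0, 200 * 1000 * 1000};    /* 200 ms */

        close(fds[0]);          /* mock postmaster only owns the write end */
        nanosleep(&ts, NULL);
        _exit(0);               /* the kernel closes the write end for us */
    }

    if (release_write_end)
        close(fds[1]);          /* analogous to releasing the watch handle */

    died = wait_for_writer_exit(fds[0], 1);

    if (!release_write_end)
        close(fds[1]);
    close(fds[0]);
    waitpid(pid, NULL, 0);      /* reap the mock postmaster */
    return died;
}
```

Calling demo_death_watch(1) should report the death shortly after the
mock postmaster exits, while demo_death_watch(0) should time out,
because the watcher itself still counts as a writer on the pipe.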

I've also handled the non-win32 EXEC_BACKEND case, which I understand
exists only for testing purposes, by passing the new state through
BackendParameters in the usual way.

A ReleasePostmasterDeathWatchHandle() call is unnecessary on win32
(the function doesn't exist there - the need to call it on Unix is a
result of its implementation). I'd like to avoid having calls to it in
each auxiliary process. It should be called in a single sweet spot
that doesn't put any burden on child process authors to remember to
call it themselves.

Disappointingly, and despite a big effort, there doesn't seem to be a
way to have the win32 WaitForMultipleObjects() call wake on postmaster
death in addition to everything else, in the same way that select()
does. There are therefore now two blocking calls, each in a thread of
its own (when the latch code is interested in postmaster death -
otherwise, it's single-threaded as before).

The threading stuff (in particular, the fact that we use a named pipe
in a thread, where the name of the pipe is derived from the process
PID) is inspired by the win32 signal emulation in
src/backend/port/win32/signal.c.

You can easily observe that it works as advertised on Windows by
starting Postgres with archiving enabled, using Task Manager to
monitor processes, and doing the following to the postmaster (assuming
it has a PID of 1234). This is the Windows equivalent of kill -9:

C:\Users\Peter>taskkill /pid 1234 /F

You'll see that it takes about a second for the archiver to exit. All
processes exit.

Thoughts?

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

