Re: Latch implementation that wakes on postmaster death on both win32 and Unix - Mailing list pgsql-hackers

From Florian Pflug
Subject Re: Latch implementation that wakes on postmaster death on both win32 and Unix
Date
Msg-id 7867C59B-E22A-4C25-8B5E-65AE5ECAF4C9@phlo.org
In response to Re: Latch implementation that wakes on postmaster death on both win32 and Unix  (Peter Geoghegan <peter@2ndquadrant.com>)
Responses Re: Latch implementation that wakes on postmaster death on both win32 and Unix
List pgsql-hackers
On Jul 8, 2011, at 11:57, Peter Geoghegan wrote:
> On 7 July 2011 19:15, Robert Haas <robertmhaas@gmail.com> wrote:
>>> I'm not concerned about the possibility of spurious extra cycles of
>>> auxiliary process event loops - should I be?
>>
>> A tight loop would be bad, but an occasional spurious wake-up seems harmless.
>
> We should also assert !PostmasterIsAlive() from within the latch code
> after waking due to apparent Postmaster death. The reason that I don't
> want to follow Florian's suggestion to check it in production is that
> I don't know what to do if the postmaster turns out to be alive. Why
> is it more reasonable to try again than to just return?

I'd say return, but don't indicate postmaster death in the return value
if PostmasterIsAlive() returns true. Or don't call PostmasterIsAlive() in
WaitLatch(), and return indicating postmaster death whenever select()
says so, and put the burden of re-checking on the callers.

I agree that retrying isn't all that reasonable.
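
To make the two options concrete, here is a rough sketch. It assumes postmaster death is detected through a pipe whose write end only the parent holds open, so that EOF on the read end means the parent is gone. All names (open_death_pipe, parent_is_alive, wait_for_parent_death, the WAKE_* flags) are made up for illustration and are not the patch's actual code; parent_is_alive() merely stands in for PostmasterIsAlive().

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical wake-up flags, not PostgreSQL's real constants. */
enum { WAKE_TIMEOUT = 1, WAKE_PARENT_DEATH = 2 };

/*
 * Create the "death pipe". The parent keeps fds[1] open and never writes
 * to it. The read end is made non-blocking so a re-check can never hang.
 */
static int open_death_pipe(int fds[2])
{
    if (pipe(fds) != 0)
        return -1;
    return fcntl(fds[0], F_SETFL, O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Hypothetical stand-in for PostmasterIsAlive(): only EOF (read() == 0)
 * proves that every writer is gone. EAGAIN or a stray byte means alive.
 */
static bool parent_is_alive(int death_fd)
{
    char c;

    return read(death_fd, &c, 1) != 0;
}

/*
 * Wait until the pipe becomes readable or the timeout expires.
 * recheck = true:  option 1, report death only if the re-check agrees.
 * recheck = false: option 2, report whatever select() says and leave
 *                  the re-check to the caller.
 * Returns 0 when the wake-up carries no reportable event.
 */
static int wait_for_parent_death(int death_fd, long timeout_ms, bool recheck)
{
    fd_set rfds;
    struct timeval tv;
    int rc;

    assert(death_fd >= 0 && death_fd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(death_fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(death_fd + 1, &rfds, NULL, NULL, &tv);
    if (rc == 0)
        return WAKE_TIMEOUT;
    if (rc < 0 || !FD_ISSET(death_fd, &rfds))
        return 0;               /* error or EINTR: nothing to report */
    if (recheck && parent_is_alive(death_fd))
        return 0;               /* readable but alive: just return */
    return WAKE_PARENT_DEATH;
}
```

With recheck set, a descriptor that is reported ready while the parent still lives makes the function return without the death flag, and nothing is retried. With recheck cleared, the caller gets the death flag on any readiness report and has to call parent_is_alive() itself before acting on it.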

> If the
> spurious wake-up thing was a problem that we could actually reproduce,
> then maybe I'd have an opinion on it. As it stands, our entire basis
> for thinking this may be a problem is the sentence "There may be other
> circumstances in which a file descriptor is spuriously reported as
> ready". That seems rather flimsy.

Flimsy or not, it pretty clearly warns us not to depend on there being
no spurious wake-ups. Whether or not we know how to actually produce
them is IMHO largely irrelevant - what matters is whether the guarantees
given by select() match the expectations of our code. Which, according to
the cited passage, they currently don't.

> Anyone that still has any misgivings about this will probably feel
> better once the assertion is never reported to fail on any of the
> diverse systems that PostgreSQL will be tested on in advance of the
> 9.2 release.

I'm not so convinced that WaitLatch() will get exercised much on
assert-enabled builds. But I might very well be wrong there...

best regards,
Florian Pflug


