Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease - Mailing list pgsql-hackers

From Florian Pflug
Subject Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease
Date
Msg-id 33791957-ABB9-437A-B82E-7ED8420D16FC@phlo.org
Whole thread Raw
In response to Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Feb14, 2014, at 14:07 , Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-02-14 13:52:45 +0100, Florian Pflug wrote:
>>> I agree we should do that, but imo not in the backbranches. Anything
>>> more than than the minimal fix in that code should be avoided in the
>>> stable branches, this stuff is friggin performance sensitive, and the
>>> spinlock already is a *major* contention point in many workloads.
>>
>> No argument there. But unless I missed something, there actually is a bug
>> there, and using volatile won't fix it. A barrier would, but what about the
>> back branches that don't have barrier.h?
>
> Yea, but I don't see a better alternative. Realistically the likelihood
> of a problem without the compiler reordering stuff is miniscule on any
> relevant platform that's supported by the !barrier.h branches. Newer
> ARMs are the only realistic suspect, and the support for in older
> releases wasn't so good...

Isn't POWER/PowerPC potentially affected by this?

Also, there is still the alternative of not resetting lwWaitLink in LWLockRelease.
I can understand why you oppose that - it's certainly nicer to not have stray
pointer lying around. But since (as least as far as we know)

a) resetting lwWaitLink is not strictly necessary
b) resetting lwWaitLink introduces a race condition (however unlikely)

we'll trade correctness for cleanliness if we continue to reset lwWaitLink
without protecting against the race. That's a bit of a weird trade-off to make.

Another idea for a fix would be to conflate lwWaiting and lwWaitLink into one
field. We could replace "lwWaiting" by "lwWaitLink != NULL" everywhere it's
tested, and set lwWaitLink to some special non-NULL value (say 0x1) when we
enqueue a PGPROC, instead of setting it to NULL and setting lwWaiting to true.

We'd then depend on pointer-sized stores being atomic, which I think we depend
on in other places already.

best regards,
Florian Pflug




pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: walsender doesn't send keepalives when writes are pending
Next
From: David Beck
Date:
Subject: Re: New hook after raw parsing, before analyze