Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease
Date
Msg-id 20140214155118.GC20375@awork2.anarazel.de
Whole thread Raw
In response to Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease  (Florian Pflug <fgp@phlo.org>)
Responses Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease  (knizhnik <knizhnik@garret.ru>)
Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease  (Florian Pflug <fgp@phlo.org>)
List pgsql-hackers
On 2014-02-14 15:03:16 +0100, Florian Pflug wrote:
> On Feb14, 2014, at 14:07 , Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2014-02-14 13:52:45 +0100, Florian Pflug wrote:
> >>> I agree we should do that, but imo not in the backbranches. Anything
> >>> more than than the minimal fix in that code should be avoided in the
> >>> stable branches, this stuff is friggin performance sensitive, and the
> >>> spinlock already is a *major* contention point in many workloads.
> >> 
> >> No argument there. But unless I missed something, there actually is a bug
> >> there, and using volatile won't fix it. A barrier would, but what about the
> >> back branches that don't have barrier.h?
> > 
> > Yea, but I don't see a better alternative. Realistically the likelihood
> > of a problem without the compiler reordering stuff is miniscule on any
> > relevant platform that's supported by the !barrier.h branches. Newer
> > ARMs are the only realistic suspect, and the support for in older
> > releases wasn't so good...
> 
> Isn't POWER/PowerPC potentially affected by this? 

I wouldn't consider it a major architecture... And I am not sure how
much out of order a CPU has to be to be affected by this, the read side
uses spinlocks, which in most of the spinlock implementations implies a
full memory barrier which in many cache coherency designs will cause
other CPUs to flush writes. And I think the control dependency in the
PGSemaphoreUnlock() loop will actually cause a flush on ppc...

> Also, there is still the alternative of not resetting lwWaitLink in LWLockRelease.
> I can understand why you oppose that - it's certainly nicer to not have stray
> pointer lying around. But since (as least as far as we know)
> 
> a) resetting lwWaitLink is not strictly necessary

I don't want to rely on that in the backbranches, that's an entirely new
assumption. I think anything more than minimal changes are hard to
justify for a theoretical issue that hasn't been observed in the field.

I think the biggest danger here is that writes are reordered by the
compiler, that we definitely need to protect against. Making a variable
volatile or introducing a memory barrier is pretty simple to analyze.

> b) resetting lwWaitLink introduces a race condition (however unlikely)
> 
> we'll trade correctness for cleanliness if we continue to reset lwWaitLink
> without protecting against the race. That's a bit of a weird trade-off to make.

It's not just cleanliness, it's being able to actually debug crashes.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Florian Pflug
Date:
Subject: Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease
Next
From: Stephen Frost
Date:
Subject: Re: HBA files w/include support?