On 07/29/2015 09:35 PM, Andres Freund wrote:
> On 2015-07-29 20:23:24 +0300, Heikki Linnakangas wrote:
>> Backend A has called LWLockWaitForVar(X) on a lock, and is now waiting on
>> it. The lock holder releases the lock, and wakes up A. But before A wakes up
>> and sees that the lock is free, another backend acquires the lock again. It
>> runs LWLockAcquireWithVar to the point just before setting the variable's
>> value. Now A wakes up, sees that the lock is still (or again) held, and that
>> the variable's value still matches the old one, and goes back to sleep. The
>> new lock holder won't wake it up until it updates the value again, or
>> releases the lock.
>
> I'm not sure whether this observation is about my patch or the general
> lwlock variable mechanism. In my opinion that behaviour exists today
> both in 9.4 and 9.5.

In 9.4, LWLockAcquire holds the spinlock from the point where it marks
the lock as held until it has updated the variable. And
LWLockWaitForVar() holds the spinlock while it checks that the lock is
held and that the variable's value matches. So the race cannot happen
on 9.4.

To reiterate, with 9.5, it's possible that a backend is sleeping in
LWLockWaitForVar(oldvar=123), even though the lock is currently held by
another backend with value 124. That seems wrong, or surprising at the
very least.

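
To make the interleaving concrete, here is a rough single-threaded replay
of it. This is a toy model, not the lwlock.c code: the ToyLock struct and
all toy_* functions are made up for illustration, and the only assumption
is that the waiter goes back to sleep when the lock is held and the
variable still equals the old value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for an LWLock with an attached variable (hypothetical). */
typedef struct ToyLock
{
    bool     held;
    uint64_t var;
} ToyLock;

/* Waiter's recheck on wake-up: sleep again iff held and value unchanged. */
static bool
toy_waiter_must_sleep(const ToyLock *lock, uint64_t oldval)
{
    return lock->held && lock->var == oldval;
}

static void
toy_release(ToyLock *lock)
{
    lock->held = false;
}

/* 9.4-like: "held" and the variable change in one critical section. */
static void
toy_acquire_atomic(ToyLock *lock, uint64_t newval)
{
    assert(!lock->held);
    lock->held = true;
    lock->var = newval;
}

/* 9.5-like: the same acquire split into two separately visible steps. */
static void
toy_acquire_step1_mark_held(ToyLock *lock)
{
    assert(!lock->held);
    lock->held = true;
}

static void
toy_acquire_step2_set_var(ToyLock *lock, uint64_t newval)
{
    assert(lock->held);
    lock->var = newval;
}

/*
 * Replays the problematic schedule. Returns true if the waiter decided to
 * sleep on oldval=123 while the lock ends up held with a different value.
 */
static bool
toy_replay_split(void)
{
    ToyLock lock = { .held = true, .var = 123 };
    bool    sleeps;

    toy_release(&lock);                  /* old holder releases, wakes A */
    toy_acquire_step1_mark_held(&lock);  /* new holder, var still 123 */
    sleeps = toy_waiter_must_sleep(&lock, 123);  /* A rechecks here */
    toy_acquire_step2_set_var(&lock, 124);       /* too late for A */

    return sleeps && lock.held && lock.var != 123;
}

/* Same schedule, but the waiter can only observe the lock after both steps. */
static bool
toy_replay_atomic(void)
{
    ToyLock lock = { .held = true, .var = 123 };

    toy_release(&lock);
    toy_acquire_atomic(&lock, 124);
    return toy_waiter_must_sleep(&lock, 123);
}
```

In this model toy_replay_split() returns true (A is asleep on 123 while
the holder's value is 124), and toy_replay_atomic() returns false, because
A never gets to see the intermediate "held, but old value" state.
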
> But I think that's fine because that "race" seems pretty
> fundamental. After all, you could have called LWLockWaitForVar() just
> after the second locker had set the variable to the same value.

I'm not talking about setting it to the same value. I'm talking about
setting it to a different value. (I talked about setting it to the same
value later in the email, and I agree that's a pretty fundamental
problem that exists in 9.4 as well.)

- Heikki