Re: LWLock deadlock and gdb advice - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: LWLock deadlock and gdb advice
Msg-id 55BA3684.20906@iki.fi
In response to Re: LWLock deadlock and gdb advice  (Andres Freund <andres@anarazel.de>)
Responses Re: LWLock deadlock and gdb advice
List pgsql-hackers
On 07/29/2015 09:35 PM, Andres Freund wrote:
> On 2015-07-29 20:23:24 +0300, Heikki Linnakangas wrote:
>> Backend A has called LWLockWaitForVar(X) on a lock, and is now waiting on
>> it. The lock holder releases the lock, and wakes up A. But before A wakes up
>> and sees that the lock is free, another backend acquires the lock again. It
>> runs LWLockAcquireWithVar to the point just before setting the variable's
>> value. Now A wakes up, sees that the lock is still (or again) held, and that
>> the variable's value still matches the old one, and goes back to sleep. The
>> new lock holder won't wake it up until it updates the value again, or
>> releases the lock.
>
> I'm not sure whether this observation is about my patch or the general
> lwlock variable mechanism. In my opinion that behaviour exists today
> both in 9.4 and 9.5.

In 9.4, LWLockAcquire holds the spinlock from the moment it marks the 
lock as held until it has updated the variable. And LWLockWaitForVar() 
holds the same spinlock while it checks that the lock is held and that 
the variable's value matches. So it cannot happen on 9.4.
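
To make the 9.4 argument concrete, here is a toy model of it. This is only a sketch, not the actual lwlock.c code: ToyLock94, toy94_acquire_with_var() and toy94_wait_check() are made-up names, and a C11 atomic_flag stands in for the spinlock. The point is that "mark held" and "set variable" sit in one critical section, and the waiter's check sits in another, so the waiter can never see the lock held with the previous holder's value.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 9.4-style lock; not PostgreSQL's struct. */
typedef struct ToyLock94
{
    atomic_flag spin;   /* plays the role of the per-lock spinlock */
    bool        held;   /* is the lock currently held? */
    uint64_t    var;    /* the variable waiters compare against */
} ToyLock94;

static void
spin_acquire(atomic_flag *f)
{
    while (atomic_flag_test_and_set_explicit(f, memory_order_acquire))
        ;               /* busy-wait until the flag was clear */
}

static void
spin_release(atomic_flag *f)
{
    atomic_flag_clear_explicit(f, memory_order_release);
}

void
toy94_init(ToyLock94 *l, uint64_t initial)
{
    atomic_flag_clear(&l->spin);
    l->held = false;
    l->var = initial;
}

/* Mark the lock held AND publish the new value in one critical section. */
bool
toy94_acquire_with_var(ToyLock94 *l, uint64_t newval)
{
    bool        got_it;

    spin_acquire(&l->spin);
    got_it = !l->held;
    if (got_it)
    {
        l->held = true;
        l->var = newval;
    }
    spin_release(&l->spin);
    return got_it;
}

void
toy94_release(ToyLock94 *l)
{
    spin_acquire(&l->spin);
    l->held = false;
    spin_release(&l->spin);
}

/*
 * Waiter's check: returns true if the waiter has to (keep) sleep(ing),
 * i.e. the lock is held and the value still equals oldval.  Both facts
 * are read under the spinlock; *seen receives the value that was read.
 */
bool
toy94_wait_check(ToyLock94 *l, uint64_t oldval, uint64_t *seen)
{
    bool        must_sleep;

    spin_acquire(&l->spin);
    *seen = l->var;
    must_sleep = l->held && l->var == oldval;
    spin_release(&l->spin);
    return must_sleep;
}
```

In this model, once toy94_acquire_with_var(&l, 124) has returned, toy94_wait_check(&l, 123, &seen) returns false and reports 124; there is no point in between at which the waiter can observe "held, value 123".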

To reiterate, with 9.5, it's possible that a backend is sleeping in 
LWLockWaitForVar(oldvar=123), even though the lock is currently held by 
another backend with value 124. That seems wrong, or surprising at the 
very least.
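
The interleaving can be replayed deterministically in a few lines of C. Again, this is a toy sketch under my own assumptions, not the 9.5 code: ToyLock95 and the toy95_* functions are hypothetical, and the only property modelled is that marking the lock as held and setting the variable are two separate steps that a waiter's check can fall between.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 9.5-style lock; not PostgreSQL's struct. */
typedef struct ToyLock95
{
    bool        held;   /* is the lock currently held? */
    uint64_t    var;    /* the variable waiters compare against */
} ToyLock95;

/* Step 1 of the acquire: the lock becomes visible as held. */
void
toy95_mark_held(ToyLock95 *l)
{
    l->held = true;
}

/* Step 2 of the acquire: the new value is published, later. */
void
toy95_set_var(ToyLock95 *l, uint64_t newval)
{
    l->var = newval;
}

/* Waiter's check: sleep if the lock is held and the value is unchanged. */
bool
toy95_waiter_sleeps(const ToyLock95 *l, uint64_t oldval)
{
    return l->held && l->var == oldval;
}

/*
 * Replay the schedule from the description above.  Returns true if
 * backend A ends up asleep waiting on 123 while the lock is held with a
 * different value.
 */
bool
toy95_replay(void)
{
    /* The previous holder has released the lock; the value is still 123. */
    ToyLock95   l = {false, 123};
    bool        a_asleep;

    toy95_mark_held(&l);                        /* backend B, step 1 */
    a_asleep = toy95_waiter_sleeps(&l, 123);    /* backend A rechecks */
    toy95_set_var(&l, 124);                     /* backend B, step 2 */

    return a_asleep && l.held && l.var == 124;
}
```

toy95_replay() returns true in this model: A decided to go back to sleep on 123 just before B stored 124, and nothing wakes A until B updates the value again or releases the lock.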

> But I think that's fine because that "race" seems pretty
> fundamental. After all, you could have called LWLockWaitForVar() just
> after the second locker had set the variable to the same value.

I'm not talking about setting it to the same value. I'm talking about 
setting it to a different value. (I talked about setting it to the same 
value later in the email, and I agree that's a pretty fundamental 
problem and exists with 9.4 as well).

- Heikki


