Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease - Mailing list pgsql-hackers

From Florian Pflug
Subject Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease
Date
Msg-id 8918EF09-4727-4896-AD34-3DAEC3063883@phlo.org
Whole thread Raw
In response to Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease
List pgsql-hackers
On Feb14, 2014, at 19:21 , Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-02-14 18:49:33 +0100, Florian Pflug wrote:
>> Well, the assumption isn't all that new. We already have the situation that
>> a PGPROC may be not on any wait queue, yet its lwWaitLink may be non-NULL.
>> Currently, the process who took it off the queue is responsible for rectifying
>> that eventually, with the changed proposed below it'll be the backend who
>> owns the PGPROC. From the point of view of other backends, nothing really
>> changes.
>
> That window is absolutely tiny tho.

True, but it's there, so if anything counts on that never being the case, it's
still already broken.

>>>> b) resetting lwWaitLink introduces a race condition (however unlikely)
>>>>
>>>> we'll trade correctness for cleanliness if we continue to reset lwWaitLink
>>>> without protecting against the race. That's a bit of a weird trade-off to make.
>>>
>>> It's not just cleanliness, it's being able to actually debug crashes.
>>
>> We could still arrange for the stray lwWaitLink from being visible only
>> momentarily. If a backend is woken up and detects that lwWaiting is false,
>> it knows that it cannot be on any wait queue - it was just removed, and
>> hasn't added itself again yet. At that point, it's safe to reset lwWaitLink
>> back to NULL. The stray value would thus only be visible while a backend executes
>> the PGSemaphoreLock() loop, and whether or not this is the case can be deduced
>> from a stack trace. So I'm not convinced that this makes debugging any harder.
>
> That's not actually safe on an out of order architecture afaics. Without
> barriers the store to lwWaitLink in the woken up backend can preempt the
> read for the next element in the PGSemaphoreUnlock() loop.

Hm, true. I guess we could prevent that by introducing an artificial dependency
between the read and the write - something like
 proc = head; head = proc->lwWaitLink /*  * We don't ever expect to actually PANIC here, but the test forces the  *
loadto execute before the store to lwWaiting. This prevents a race  * between reading lwWaitLink here and resetting it
backto zero in the  * awoken backend (Note that backends can wake up spuriously, so just  * reading it before doing
PGSemaphoreUnlockis insufficient)  */ if (head != MyProc)   proc->lwWaiting = false; else   elog(PANIC, ...)
PGSemaphoreUnlock(&proc->sem);

(This assumes that proc is a volatile pointer)

Another idea would be to do as you suggest and only mark the PGPROC pointers
volatile, but to additionally add a check for queue corruption somewhere. We should
be able to detect that - if we ever hit this issue, LWLockRelease should find a
PGPROC while traversing the queue whose lwWaitLink is NULL but which isn't equal to
lock->tail. If that ever happens, we'd simply PANIC.

best regards,
Florian Pflug






pgsql-hackers by date:

Previous
From: digoal
Date:
Subject: Re: [BUGS] BUG #9210: PostgreSQL string store bug? not enforce check with correct characterSET/encoding
Next
From: Tom Lane
Date:
Subject: Re: Recovery inconsistencies, standby much larger than primary