On Feb14, 2014, at 19:21 , Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-02-14 18:49:33 +0100, Florian Pflug wrote:
>> Well, the assumption isn't all that new. We already have the situation that
>> a PGPROC may be not on any wait queue, yet its lwWaitLink may be non-NULL.
>> Currently, the process who took it off the queue is responsible for rectifying
>> that eventually, with the changed proposed below it'll be the backend who
>> owns the PGPROC. From the point of view of other backends, nothing really
>> changes.
>
> That window is absolutely tiny tho.
True, but it's there, so if anything counts on that never being the case, it's
still already broken.
>>>> b) resetting lwWaitLink introduces a race condition (however unlikely)
>>>>
>>>> we'll trade correctness for cleanliness if we continue to reset lwWaitLink
>>>> without protecting against the race. That's a bit of a weird trade-off to make.
>>>
>>> It's not just cleanliness, it's being able to actually debug crashes.
>>
>> We could still arrange for the stray lwWaitLink from being visible only
>> momentarily. If a backend is woken up and detects that lwWaiting is false,
>> it knows that it cannot be on any wait queue - it was just removed, and
>> hasn't added itself again yet. At that point, it's safe to reset lwWaitLink
>> back to NULL. The stray value would thus only be visible while a backend executes
>> the PGSemaphoreLock() loop, and whether or not this is the case can be deduced
>> from a stack trace. So I'm not convinced that this makes debugging any harder.
>
> That's not actually safe on an out of order architecture afaics. Without
> barriers the store to lwWaitLink in the woken up backend can preempt the
> read for the next element in the PGSemaphoreUnlock() loop.
Hm, true. I guess we could prevent that by introducing an artificial dependency
between the read and the write - something like
proc = head; head = proc->lwWaitLink /* * We don't ever expect to actually PANIC here, but the test forces the *
loadto execute before the store to lwWaiting. This prevents a race * between reading lwWaitLink here and resetting it
backto zero in the * awoken backend (Note that backends can wake up spuriously, so just * reading it before doing
PGSemaphoreUnlockis insufficient) */ if (head != MyProc) proc->lwWaiting = false; else elog(PANIC, ...)
PGSemaphoreUnlock(&proc->sem);
(This assumes that proc is a volatile pointer)
Another idea would be to do as you suggest and only mark the PGPROC pointers
volatile, but to additionally add a check for queue corruption somewhere. We should
be able to detect that - if we ever hit this issue, LWLockRelease should find a
PGPROC while traversing the queue whose lwWaitLink is NULL but which isn't equal to
lock->tail. If that ever happens, we'd simply PANIC.
best regards,
Florian Pflug