Re: BUG #1512: Assertion failure (lock.c:1537) with SELECT FOR UPDATE and savepoints - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #1512: Assertion failure (lock.c:1537) with SELECT FOR UPDATE and savepoints
Msg-id 14606.1109656705@sss.pgh.pa.us
In response to BUG #1512: Assertion failure (lock.c:1537) with SELECT FOR UPDATE and savepoints  ("Stephen Clouse" <stephenc@theiqgroup.com>)
Responses Re: BUG #1512: Assertion failure (lock.c:1537) with SELECT FOR UPDATE and savepoints  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
"Stephen Clouse" <stephenc@theiqgroup.com> writes:
> Description:        Assertion failure (lock.c:1537) with SELECT FOR UPDATE

It looks to me like the problem is that RemoveFromWaitQueue() is too
lazy.  Its comments say

 * NB: this does not remove the process' proclock object, nor the lock object,
 * even though their counts might now have gone to zero.  That will happen
 * during a subsequent LockReleaseAll call, which we expect will happen
 * during transaction cleanup.  (Removal of a proc from its wait queue by
 * this routine can only happen if we are aborting the transaction.)

but of course LockReleaseAll is not called until ROLLBACK.  I think the
scenario is:

* Query cancel in session 2 kicks the session off the wait queue for
session 1's transaction ID lock, but because of the above it leaves a
proclock entry with count zero attached to the lock.

* Rollback in session 1 tries to remove the transaction ID lock,
and gets unhappy because there is still a proclock attached to it.
(A commit in session 1 fails the same way.)
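
To make the sequence concrete, here is a toy model of the two steps.
It is only a sketch: the ToyLock and ToyProcLock structs and all the
toy_* functions are made up for illustration and are much simpler than
the real structures in lock.c.  It shows a waiter being dequeued lazily
and the holder then finding a stray proclock still attached.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins (hypothetical), NOT the real PostgreSQL structs. */
typedef struct ToyLock
{
    int nRequested;     /* requests outstanding: holders plus waiters */
    int nProcLocks;     /* proclock entries still attached to the lock */
} ToyLock;

typedef struct ToyProcLock
{
    ToyLock *lock;      /* lock this entry belongs to */
    int      nHeld;     /* how many times this backend holds the lock */
} ToyProcLock;

/* Attach a backend's proclock entry and register one request. */
static void
toy_attach(ToyLock *lock, ToyProcLock *pl)
{
    pl->lock = lock;
    pl->nHeld = 0;
    lock->nProcLocks++;
    lock->nRequested++;
}

/* The "lazy" behavior: drop the request but leave the proclock attached. */
static void
toy_lazy_remove_from_wait_queue(ToyProcLock *pl)
{
    pl->lock->nRequested--;
}

/* The holder releases its lock and detaches its own proclock. */
static void
toy_release_holder(ToyProcLock *pl)
{
    pl->nHeld--;
    pl->lock->nRequested--;
    pl->lock->nProcLocks--;
}

/* Run the scenario; return how many proclocks are left on the lock. */
static int
toy_scenario_leftover_proclocks(void)
{
    ToyLock     xid_lock = {0, 0};
    ToyProcLock s1, s2;

    toy_attach(&xid_lock, &s1);             /* session 1 takes its XID lock */
    s1.nHeld = 1;
    toy_attach(&xid_lock, &s2);             /* session 2 blocks on it */

    toy_lazy_remove_from_wait_queue(&s2);   /* query cancel in session 2 */
    toy_release_holder(&s1);                /* rollback in session 1 */

    /* Nobody requests the lock any more, yet s2's entry is still attached. */
    return xid_lock.nProcLocks;
}
```

In this model toy_scenario_leftover_proclocks() returns 1 while
nRequested has dropped to 0, which is the sort of inconsistency the
lock-release code is not prepared to see.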

In reality this code has been broken right along, but until 8.0 there
was only a very narrow window for failure --- session 1 would have to
try to release the lock between RemoveFromWaitQueue and LockReleaseAll
in session 2's transaction abort sequence.

ISTM we have to fix RemoveFromWaitQueue to remove the proclock object
immediately if its count has gone to zero.  It should be impossible
for the lock's count to have gone to zero (that would imply no one
else holds the lock, so we couldn't be waiting on it) so an Assert
is sufficient for that part.
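
Roughly, the shape of the change I have in mind is sketched below.
This is not a patch: SketchLock, SketchProcLock and
sketch_remove_from_wait_queue are hypothetical simplified stand-ins,
and the real routine has more bookkeeping (wait queue links, hash
table entries and so on) that is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins (hypothetical), NOT the real lock.c structs. */
typedef struct SketchLock
{
    int nRequested;     /* requests outstanding: holders plus waiters */
    int nProcLocks;     /* proclock entries attached to the lock */
} SketchLock;

typedef struct SketchProcLock
{
    SketchLock *lock;   /* lock this entry belongs to, NULL once removed */
    int         nHeld;  /* how many times this backend holds the lock */
    bool        waiting;    /* true while the backend sits in the wait queue */
} SketchProcLock;

/*
 * Dequeue a waiter.  Returns true if the proclock entry was removed
 * immediately because nothing is held through it any more.
 */
static bool
sketch_remove_from_wait_queue(SketchProcLock *pl)
{
    SketchLock *lock = pl->lock;

    assert(pl->waiting);
    pl->waiting = false;
    lock->nRequested--;

    /*
     * Someone else must still hold the lock, or we could not have been
     * waiting on it, so the lock's own count cannot have reached zero.
     */
    assert(lock->nRequested > 0);

    /* The new part: clean up the proclock now instead of deferring it. */
    if (pl->nHeld == 0)
    {
        lock->nProcLocks--;
        pl->lock = NULL;
        return true;
    }
    return false;
}
```

With a lock that has one holder and one waiter (nRequested = 2,
nProcLocks = 2), dequeuing the waiter leaves nRequested = 1 and
nProcLocks = 1, so the holder's later release sees only its own entry.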

Comments?

            regards, tom lane
