Alvaro Herrera wrote:
> The fix for the immediate bug is to add some code to HTSU so that it
> checks for locks by other transactions even when the tuple was created
> by us. I haven't looked at the other tqual routines yet, but I imagine
> they will need equivalent fixes.
This POC patch changes the two places in HeapTupleSatisfiesUpdate that
need to be touched for this to work. It is probably too simplistic, in
that I make the involved cases return HeapTupleBeingUpdated without
checking that there actually are remote lockers, which is the only case
of concern. I'm not yet sure whether this is the final form of the fix,
or whether we should instead expand the Multi (in the cases where there
is a multi) and verify whether any lockers are transactions other than
the current one. As is, nothing seems to break, but I think that's
probably just chance and should not be relied upon.
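
To make the alternative concrete, here is a rough standalone sketch of
"expand the multi and look for lockers other than ourselves". It is not
taken from the patch and does not use the backend headers: SketchXid,
SketchResult, sketch_xid_is_mine() and sketch_has_remote_locker() are
hypothetical names, and the current transaction (including its
subtransactions) is modeled as a plain array of XIDs. In the backend the
member list would presumably come from GetMultiXactIdMembers() and the
"is it mine" test from TransactionIdIsCurrentTransactionId(). The sketch
also ignores whether a remote member is still running, which a real fix
would have to consider.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the PostgreSQL definitions. */
typedef uint32_t SketchXid;

typedef enum SketchResult
{
	SKETCH_MAY_BE_UPDATED,		/* no locker other than ourselves */
	SKETCH_BEING_UPDATED		/* at least one remote locker */
} SketchResult;

/* Is xid the current transaction or one of its subtransactions? */
bool
sketch_xid_is_mine(SketchXid xid, const SketchXid *my_xids, size_t n_my)
{
	for (size_t i = 0; i < n_my; i++)
	{
		if (my_xids[i] == xid)
			return true;
	}
	return false;
}

/* Does the expanded multi contain any member that is not ours? */
bool
sketch_has_remote_locker(const SketchXid *members, size_t nmembers,
						 const SketchXid *my_xids, size_t n_my)
{
	assert(members != NULL || nmembers == 0);

	for (size_t i = 0; i < nmembers; i++)
	{
		if (!sketch_xid_is_mine(members[i], my_xids, n_my))
			return true;
	}
	return false;
}

/* Map the membership check onto the two outcomes discussed above. */
SketchResult
sketch_result(const SketchXid *members, size_t nmembers,
			  const SketchXid *my_xids, size_t n_my)
{
	if (sketch_has_remote_locker(members, nmembers, my_xids, n_my))
		return SKETCH_BEING_UPDATED;
	return SKETCH_MAY_BE_UPDATED;
}
```

The point of the extra check is that a tuple we created and locked only
ourselves would keep its current treatment, and only a multi that really
contains another transaction would be reported as being updated.
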
Attached are two isolation specs: one illustrates the exact issue
reported by Dan, and the other a similar one in which an aborted
subtransaction has updated the second version of the row. (The latter
takes a slightly different code path.)
As far as I can tell, no routine in tqual.c other than
HeapTupleSatisfiesUpdate needs to change. The ones that test for
visibility (Dirty, Any, Self) are only concerned with whether the tuple
is visible, which of course isn't affected by the tuple being locked;
and HeapTupleSatisfiesVacuum is only concerned with whether the tuple is
dead, which similarly isn't affected.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services