On 23.09.2013 01:07, Hannu Krosing wrote:
> On 09/20/2013 12:55 PM, Heikki Linnakangas wrote:
>> Hi,
>>
>> Prompted by Andres Freund's comments on my Freezing without Write I/O
>> patch, I realized that there's an existing bug in the way
>> predicate locking handles freezing (or rather, it doesn't handle it).
>>
>> When a tuple is predicate-locked, the key of the lock is ctid+xmin.
>> However, when a tuple is frozen, its xmin is changed to FrozenXid.
>> That effectively invalidates any predicate lock on the tuple, as
>> checking for a lock on the same tuple later won't find it as the xmin
>> is different.
>>
>> Attached is an isolationtester spec to demonstrate this.
> The case is even fishier than that.
>
> That is, you can get bad behaviour on at least v9.2.4 even without
> VACUUM FREEZE.
>
> You just need to run
>
> permutation "r1" "r2" "w1" "w2" "c1" "c2"
>
> twice in a row.
>
> The first time it does get a serialization error at "c2",
> but the 2nd time both "c1" and "c2" complete successfully.

Oh, interesting. I did some debugging on this: there are actually *two*
bugs, either of which is enough to cause this on its own:

1. In heap_hot_search_buffer(), the PredicateLockTuple() call is passed
the wrong offset number. heapTuple->t_self is set to the tid of the first
tuple in the chain that's visited, not the one actually being read.

2. CheckForSerializableConflictIn() uses the tuple's t_ctid field
instead of t_self to check for existing predicate locks on the tuple. If
the tuple was updated, but the updater rolled back, t_ctid points to the
aborted dead tuple.
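
To make the two bugs above concrete, here is a toy model in Python. It is only a sketch, not PostgreSQL code: the HeapTuple class and the helpers tid_locked_by_reader() and writer_sees_lock() are hypothetical names invented for illustration, tids are plain (block, offset) pairs, and the lock table is keyed by tid alone (the real lock key also involves the relation and xmin).

```python
from dataclasses import dataclass


@dataclass
class HeapTuple:
    # Hypothetical stand-in for a heap tuple header.
    t_self: tuple  # (block, offset) where this tuple version lives
    t_ctid: tuple  # tid of the next version, or its own tid if there is none


def tid_locked_by_reader(chain, visible_index, buggy):
    """Tid a reader predicate-locks after walking a HOT chain.

    The buggy variant locks the first tuple visited in the chain;
    the fixed variant locks the version that was actually read.
    """
    picked = chain[0] if buggy else chain[visible_index]
    return picked.t_self


def writer_sees_lock(locks, tup, buggy):
    """Whether a writer's conflict check finds a predicate lock on tup.

    The buggy variant probes with t_ctid; the fixed variant uses t_self.
    """
    key = tup.t_ctid if buggy else tup.t_self
    return key in locks


# Bug 1: HOT chain (0,1) -> (0,2); the reader actually sees (0,2).
root = HeapTuple(t_self=(0, 1), t_ctid=(0, 2))
live = HeapTuple(t_self=(0, 2), t_ctid=(0, 2))
chain = [root, live]

locks_buggy = {tid_locked_by_reader(chain, 1, buggy=True)}
locks_fixed = {tid_locked_by_reader(chain, 1, buggy=False)}
bug1_buggy = writer_sees_lock(locks_buggy, live, buggy=False)
bug1_fixed = writer_sees_lock(locks_fixed, live, buggy=False)

# Bug 2: (0,1) was updated, but the updater rolled back, so (0,1) is
# still the live version while its t_ctid points at the dead (0,2).
survivor = HeapTuple(t_self=(0, 1), t_ctid=(0, 2))
locks = {survivor.t_self}  # the reader locked the tuple it read
bug2_buggy = writer_sees_lock(locks, survivor, buggy=True)
bug2_fixed = writer_sees_lock(locks, survivor, buggy=False)

print(bug1_buggy, bug1_fixed, bug2_buggy, bug2_fixed)
```

In this model each buggy variant probes or locks a different tid than the one the other side uses, so the conflict is silently missed; with both fixes the reader and the writer agree on t_self of the tuple actually read.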

After fixing both of those bugs, running the test case twice in a row
works, i.e. it causes a conflict and a rollback both times. Anyone see a
problem with this?

That still leaves the original problem I spotted, with freezing; that's
yet another, unrelated bug.
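
The freezing problem can be sketched the same way. This is again a toy model under stated assumptions: PredicateLocks and freeze_demo() are hypothetical names, the xmin value 1234 is made up, and the lock key is reduced to the (ctid, xmin) pair described in the original report. The only value taken from PostgreSQL is FrozenTransactionId, which is 2.

```python
FROZEN_XID = 2  # FrozenTransactionId in PostgreSQL


class PredicateLocks:
    """Hypothetical lock table keyed by (ctid, xmin)."""

    def __init__(self):
        self._keys = set()

    def lock(self, ctid, xmin):
        self._keys.add((ctid, xmin))

    def is_locked(self, ctid, xmin):
        return (ctid, xmin) in self._keys


def freeze_demo():
    locks = PredicateLocks()
    tup = {"ctid": (0, 1), "xmin": 1234}  # made-up tuple header

    # A serializable reader predicate-locks the tuple.
    locks.lock(tup["ctid"], tup["xmin"])
    found_before = locks.is_locked(tup["ctid"], tup["xmin"])

    # Freezing overwrites xmin in place; the lock table is not updated.
    tup["xmin"] = FROZEN_XID
    found_after = locks.is_locked(tup["ctid"], tup["xmin"])

    return found_before, found_after


print(freeze_demo())
```

The lock entry is still in the table after freezing, but a lookup built from the tuple's current header no longer matches it, which is the sense in which freezing invalidates the lock.
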
- Heikki