Re: Serializable Snapshot Isolation - Mailing list pgsql-hackers
From | Kevin Grittner |
---|---|
Subject | Re: Serializable Snapshot Isolation |
Date | |
Msg-id | AANLkTikkwxs56YkVvW3AzTYxxvjMKj2Nf-nZmxZsfFC0@mail.gmail.com |
In response to | Re: Serializable Snapshot Isolation (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>) |
Responses |
Re: Serializable Snapshot Isolation
|
List | pgsql-hackers |
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

> When a transaction commits, its predicate locks must be held,
> but it's not important anymore *who* holds them, as long as
> they're held for long enough.
>
> Let's move the finishedBefore field from SERIALIZABLEXACT to
> PREDICATELOCK. When a transaction commits, set the finishedBefore
> field in all the PREDICATELOCKs it holds, and then release the
> SERIALIZABLEXACT struct. The predicate locks stay without an
> associated SERIALIZABLEXACT entry until finishedBefore expires.
>
> Whenever there are two predicate locks on the same target that
> both belonged to an already-committed transaction, the one with a
> smaller finishedBefore can be dropped, because the one with higher
> finishedBefore value covers it already.

I don't think this works. Gory details follow.

The predicate locks only matter when a tuple is being written which
might conflict with one. In the notation often used for the dangerous
structures, the conflict only occurs if TN writes something which T1
can't read or T1 writes something which T0 can't read. When you
combine this with the fact that you don't have a problem unless TN
commits *first*, you can't have a problem with TN looking up a
predicate lock of a committed transaction; if it's still writing
tuples after T1's commit, the conflict can't matter and really should
be ignored.

If T1 is looking up a predicate lock for T0 and finds it committed,
there are two things which must be true for this to generate a real
conflict: TN must have committed before T0, and T0 must have
overlapped T1 -- T0 must not have been able to see T1's write. If we
have a way to establish these two facts without keeping
transaction-level data for committed transactions, predicate lock
*lookup* wouldn't stand in the way of your proposal.
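
The two rules discussed so far can be sketched in C as below. This is only an illustration, not PostgreSQL code: the `CommittedPredLock` struct, both function names, and the plain 64-bit `finishedBefore` counter (no wraparound handling) are hypothetical simplifications.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for a predicate lock left behind by
 * an already-committed transaction.  Assumption: finishedBefore is a
 * monotonically increasing counter that never wraps around.
 */
typedef struct CommittedPredLock
{
    uint32_t target;          /* hypothetical id of the locked target */
    uint64_t finishedBefore;  /* lock may be dropped once this expires */
} CommittedPredLock;

/*
 * The quoted proposal: of two locks on the same target that both
 * belonged to committed transactions, keep only the one with the
 * higher finishedBefore, since it covers the other.
 */
static CommittedPredLock
keep_covering_lock(CommittedPredLock a, CommittedPredLock b)
{
    assert(a.target == b.target);
    return (a.finishedBefore >= b.finishedBefore) ? a : b;
}

/*
 * The reply's condition: when T1 finds a predicate lock of a committed
 * T0, the conflict is real only if both facts hold.
 */
static bool
is_real_conflict(bool tn_committed_before_t0, bool t0_overlapped_t1)
{
    return tn_committed_before_t0 && t0_overlapped_t1;
}
```

The sketch takes both facts as ready-made booleans; how (or whether) they can be established without per-transaction data is the open question.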

Since the writing transaction is active, if the xmin of its starting
transaction comes before the finishedBefore value, they must have
overlapped; so I think we have that part covered, and I can't see a
problem with your proposed use of the earliest finishedBefore value.

There is a rub on the other point, though. Without transaction
information you have no way of telling whether TN committed before
T0, so you would need to assume that it did. So on this count, there
is bound to be some increase in false positives leading to
transaction rollback. Without more study, and maybe some tests, I'm
not sure how significant it is. (Actually, we might want to track
commit sequence somehow, so we can determine this with greater
accuracy.)

But wait, the bigger problems are yet to come.

The other way we can detect conflicts is a read by a serializable
transaction noticing that a different and overlapping serializable
transaction wrote the tuple we're trying to read. How do you propose
to know that the other transaction was serializable without keeping
the SERIALIZABLEXACT information? And how do you propose to record
the conflict without it? The wheels pretty much fall off the idea
entirely here, as far as I can see.

Finally, this would preclude some optimizations which I *think* will
pay off, which trade a few hundred kB more of shared memory, and some
additional CPU to maintain more detailed conflict data, for a lower
false positive rate -- meaning fewer transactions rolled back for
hard-to-explain reasons. This more detailed information is also what
seems to be desired by Dan S (on another thread) to be able to log
the information needed to be able to reduce rollbacks.

-Kevin