Re: invalid tid errors in latest 7.3.4 stable. - Mailing list pgsql-hackers
From: Kevin Brown
Subject: Re: invalid tid errors in latest 7.3.4 stable.
Date:
Msg-id: 20030927100021.GC6073@filer
In response to: Re: invalid tid errors in latest 7.3.4 stable. (Stephan Szabo <sszabo@megazone.bigpanda.com>)
List: pgsql-hackers
Stephan Szabo wrote:
> > The problem in the scenario you described should be solved if we mark any
> > rows that are selected with the "for update" option (either implicitly,
> > as with RI triggers, or explicitly) as having been modified by the
> > selecting transaction, the equivalent of (in the case of T2*) "update pk
> > set id=id where id=1" but without firing any of the ON MODIFY triggers.
> > A rollback would, of course, not have any effect on the data in those
> > rows since there weren't any real changes. This "fix" won't work,
> > of course, if the serialization code is so broken that it doesn't work
> > properly even in the face of updates (something I'd find hard to believe).
>
> That fixes the case above which will fix the ri constraints for right now
> (although they really have to stop using for update eventually), but
> doesn't really solve the serialization problem since it still exists
> AFAICS without for update. Without the for update, you still have T2*
> getting 1 row and T1* getting 0 which can't happen for either ordering of
> the transactions. It gets worse if that select as a holder at the
> beginning of T1 was say select * from fk where id=1 because SQL tells us
> that the later select can't see a different set of rows from the earlier
> one, so T2 shouldn't be allowed to commit before T1.

That's what I was afraid of, and it's what I figured serialization really means: what you see is a snapshot of the database as it was at transaction start time.

I can't think of any good way to implement proper serialization without destroying a serializable transaction's read performance, because it seems to me that the only way to implement it properly is to somehow record on disk all the rows a serializable transaction visits, which means that a serializable transaction is going to be *much* slower than a read-committed transaction. You have to mark such rows because other transactions (even read-committed ones) have to abort if they attempt to modify such a row, and the list of such rows can grow far too large to record in shared memory.

Worse, you have to maintain a dynamic list of the serializable transactions that have seen each row, and remove a transaction from the list once it commits or rolls back, because the only time a transaction changing a row needs to care about any of this is when a currently-running transaction has seen that row.

We could use the MVCC mechanism to implement it: duplicate the row being examined and assign the reader's transaction ID to the duplicate, just as if the reader had modified the row. But you also have to somehow flag the duplicate as being there as the result of a serializable read, so that other serializable transactions that try to modify the row after the one in question has committed won't themselves throw a serialization error (because without the flag they'd think they were attempting to modify a row that had been changed by someone else during their lifetime).
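To make that concrete, here's a rough two-session sketch of the read-marking idea, reusing the pk table from the example above. The error text is invented, and the behaviour shown is what the scheme would have to produce, not anything PostgreSQL actually does today:

-- Session 1: a serializable reader
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT * FROM pk WHERE id = 1;
-- Under the scheme above, this read leaves a marker on the row it
-- visited, e.g. a flagged duplicate row version carrying session 1's
-- transaction ID.

-- Session 2: any isolation level, while session 1 is still running
BEGIN;
UPDATE pk SET id = id WHERE id = 1;
-- The marker says a still-running serializable transaction has seen this
-- row, so session 2 has to abort rather than silently overtake the reader:
-- ERROR:  row has been read by a concurrent serializable transaction
ROLLBACK;

-- Session 1
COMMIT;
-- Once session 1 commits or rolls back, its marker no longer matters and
-- a retry of session 2's update succeeds.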
The other situation you have to deal with is two transactions, 1 and 2, that start and commit in that order but whose times of execution overlap. If transaction 1 modifies a row after transaction 2 starts, then commits before transaction 2 reads it, transaction 2 has to be able to detect that and throw a serialization error.

The way around that problem is to assign a commit ID to each transaction at commit time. The commit ID is just the transaction ID that would be assigned to the next transaction that runs. It might make sense for the assignment of a commit ID to increment the transaction ID counter the way the assignment of a transaction ID does. Anyway, if a serializable transaction reads a row whose commit ID is greater than the reader's transaction ID, it throws a serialization error. It's probably sufficient to store the commit ID, along with the transaction ID of the committer, in the transaction log as well as in shared memory, so that the commit ID can be quickly looked up from the transaction ID.
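Here's the same sort of sketch for the commit-ID check, with invented transaction ID numbers; again, this is the behaviour the proposal calls for, not anything that exists today:

-- Session 1: the writer
BEGIN;                                        -- session 1 is assigned xid 100

-- Session 2: a serializable reader
BEGIN;                                        -- session 2 is assigned xid 101
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

-- Session 1
UPDATE pk SET id = id WHERE id = 1;
COMMIT;                                       -- session 1 is stamped with commit ID 102,
                                              -- the xid the next transaction would get

-- Session 2
SELECT * FROM pk WHERE id = 1;
-- The row was last written by a transaction whose commit ID (102) is
-- greater than the reader's own transaction ID (101), so under the
-- proposal this read throws a serialization error instead of quietly
-- returning the old row version from session 2's snapshot.
ROLLBACK;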
Maybe there's a better way around all this, but I certainly can't think of one at the moment. :-(

--
Kevin Brown                                           kevin@sysexperts.com