Re: incoherent view of serializable transactions - Mailing list pgsql-hackers
From | Kevin Grittner |
---|---|
Subject | Re: incoherent view of serializable transactions |
Date | |
Msg-id | 4950A677.EE98.0025.0@wicourts.gov Whole thread Raw |
In response to | Re: incoherent view of serializable transactions (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: incoherent view of serializable transactions
Re: incoherent view of serializable transactions |
List | pgsql-hackers |
>>> Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> At this point, SERIALIZABLE transactions appear to have worked, with >> receipt 3 happening before the update of deposit_date; however, there >> was a window of time when the update to deposit_date was visible and >> receipt 3 was not. > >> This absolutely can't happen in a standard-compliant implementation. > > I think you mean "you'd like to believe that can't happen in a > standard-compliant implementation". "The execution of concurrent SQL-transactions at isolation level SERIALIZABLE is guaranteed to be serializable. A serializable execution is defined to be an execution of the operations of concurrently executing SQL-transactions that produces the same effect as some serial execution of those same SQL-transactions. A serial execution is one in which each SQL-transaction executes to completion before the next SQL-transaction begins." If you look at the serializable queries in the original post for this thread, it's not hard to see that this standard is not met. The insert of receipt 3 appears to happen before the update of the control record, since it has the old deposit date. The transaction which selects from both tables sees the update to the control record, so it must come after that. Yet it doesn't see the results of the first transaction. There is no sequence of serial execution which is consistent with the behavior. Perhaps you are suggesting that it's not possible in a practical sense for a DBMS to meet this standard? See below. > It doesn't include any of the > specific behaviors that are forbidden by the spec, though, so I'm less > than convinced. Following the diagram of specific behaviors allowed at each level is good evidence that these phenomena don't *define* serializable: "NOTE 53 * The exclusion of these phenomena for SQL-transactions executing at isolation level SERIALIZABLE is a consequence of the requirement that such transactions be serializable." > An appropriate way to prevent the problem is probably for the > transaction that changes the deposit_date to take out a write-excluding > lock on the receipts table before it does so. Well, a serializable transaction operating to standard would probably take out a write-excluding lock on the control table row when it is read. This would block the update to the control table until completion of any pending receipts on the old date. The blocked update would then take out a read-excluding lock before updating the row, which would then block receipt transactions looking to check the date. As soon as the update to the control table is committed, you can see the new date, and you can have confidence that the old date's receipts will all show on a SELECT. Again, I'm not suggesting that we change the behavior of PostgreSQL to match this, but that we document the difference so that those looking to switch to PostgreSQL from databases which provide true serializable transactions know that they have to explicitly lock rows to maintain a coherent view of the data. In the queries shown in the original post, the INSERT of the receipts could read the deposit date FOR SHARE to provide the equivalent functionality, although they would have to be rejiggered a bit, since that isn't allowed in subqueries. This isn't some hypothetical "maybe some day some product might implement this, but it'll never catch on" sort of thing -- Microsoft and Sybase SQL Server had this from version 1. I used it from 1990 until the conversion to PostgreSQL over the last couple years. Serializable transactions worked as advertised. Was there more blocking than PostgreSQL? Yes. Were there more deadlocks than PostgreSQL? Yes. Did it impact performance? Well, PostgreSQL is faster by enough that users commented that the applications seemed "snappier" when we switched the database underneath them from Sybase to PostgreSQL. I'm going on second-hand information here, but I'm told that IBM DB2 has used similar techniques to provide true serializable transactions for even longer. I'm somewhat mystified at the reaction this topic gets here. :-/ -Kevin
pgsql-hackers by date: