Re: Suggestion for optimization - Mailing list pgsql-hackers
From | Jon Grov |
---|---|
Subject | Re: Suggestion for optimization |
Date | |
Msg-id | jon-x9zofgxu6ku.fsf@einn.ifi.uio.no Whole thread Raw |
In response to | Re: Suggestion for optimization ("Dann Corbit" <DCorbit@connx.com>) |
List | pgsql-hackers |
(Sorry that my previous post did not reach the pgsql-hackers list, I sent it from the wrong address and was thus not considered a subscriber) "Dann Corbit" <DCorbit@connx.com> writes: > But I am a bit puzzled. How can a serializable transaction be > performed in a MVCC system? I realize the Oracle does it, and also > Postgresql, but I can't picture how that would work. In short, snapshot isolation implies this (assuming all transactions are assigned a monotonously growing timestamp, and that all updates create a new, unique version of the object): - A transactions reads the most recent version of an object that is written by a transaction who were committed at it's beginning. - Two concurrent (i.e. at some point in time, they're both active) transactions need to have disjunct write sets. This is probably best illustrated through an example: Let x and y be columns in a table a. Initially, x = 20 and y = 5 in the row where k = 1 (k is the primary key). - Let T_1 be transaction as follows: BEGIN; SELECT x FROM a WHERE k = 1; SELECT y FROM a WHERE k = 1; END; - Let T_2 be another transaction: BEGIN; UPDATE a SET x = 10 WHERE k = 1; UPDATE a SET y = 10 WHERE k = 1; END; Then, if we have the following execution: 10 BEGIN; /* T_2 */ 20 BEGIN; /* T_1 */ 30 SELECT x FROM a WHERE k = 1; /* T_1 */ 40 UPDATE a SET x = 10 WHERE k = 1; /* T_2 */ 50 UPDATE y SET a = 10 WHERE k = 1; /* T_2 */ 60 SELECT y FROM a WHERE k = 1; /* T_1 */ 70 END; /* T_2 */ 80 END; /* T_1 */ Clearly, this would not be serializable in a non-versioned database. But under MVCC and snapshot isolation, T_1 reads from a snapshot, i.e. it would not (under serialable isolation level, at least) be allowed to read anything T_2 have written. The requirement is that a transaction T reads values written by transactions committed before T's beginning. So T_1 would find that x = 20 and y = 5, just as it would if T_1 and T_2 were executed serially with T_1 before T_2. With two (or more) updating transactions, things get more complicated. Assume we have the following to transactions: T_1: BEGIN; UPDATE x SET x = x + 10; END; T_2: BEGIN; UPDATE x SET x = x - 5; END; and the following execution (assuming x = 20 before it starts): 10 BEGIN; /* T_1 */ 20 BEGIN; /* T_2 */ 30 UPDATE x SET x = x + 10; /* T_1 */ 40 END; /* T_1 */ 50 UPDATE x SET x = x - 5; /* T_2 */ 60 END; /* T_2 * Since a transaction is only allowed to read values written by transactions committed before it's beginning, both T_1 and T_2 will read x = 20. A transaction started after just this execution would then read T_2's newly written value of x, that is 15, and line 40 would become a lost update. To avoid this, PostgreSQL offers two solutions: - Read committed isolation, where a statement on the form UPDATE <table> SET <column> = <column> + <value> is considereda special case and T_2 is allowed to read T_1's value. - Serializable isolation, where T_2 would have to be aborted. If line 40 and 50 were swapped, T_2 would wait to see what happens to T_1. If it's aborted, it can safely read x = 20 regardless of the isolation level, if it's committed, the result would again depend on the selected isolation level. Hopefully, this illustrates the basic concepts. An interesting article concerning the subtleties of this subject was posted by Tom Lane a couple of days ago: http://groups.google.com/groups?q=postgresql+regression&hl=en&ie=utf-8&oe=utf-8&scoring=d&selm=11556.1017860660%40sss.pgh.pa.us&rnum=4 In addition, this seems to be the "canonical paper" on snapshot isolation: http://citeseer.nj.nec.com/berenson95critique.html -- Jon Grov, Linpro as
pgsql-hackers by date: