Thread: Serializable access giving wrong error messages?
I have some problems with serializable access. Either I have misunderstood the documentation or there is a bug in PostgreSQL. I have two examples. In both of them I have two psql windows open and start two transactions simultaneously. I shouldn't be able to see any results from the other transaction (except the message 'ERROR: Can't serialize access due to concurrent update'). At least this is how I understood it from reading chapter 9.4 (Serializable Isolation Level) of the PostgreSQL 7.2 User's Guide. But as you can see from the two examples, I can see changes in the table that are part of another transaction.

Example 1: In transaction 1 I first check that there isn't a duplicate row in the table. Then I try to insert one and get 'ERROR: Cannot insert a duplicate key into unique index test_pkey'. This shouldn't be possible.

Setup (run once, before the transactions are started):

testi=# create table test (a integer, primary key (a));
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index 'test_pkey' for table 'test'
CREATE

Transaction 1 (window 1):

testi=# begin;
BEGIN
testi=# set transaction isolation level serializable;
SET VARIABLE

Transaction 2 (window 2):

testi=# begin;
BEGIN
testi=# set transaction isolation level serializable;
SET VARIABLE

Transaction 1 (window 1):

testi=# select * from test;
 a
---
(0 rows)

Transaction 2 (window 2):

testi=# insert into test (a) values(5);
INSERT 121643 1
testi=# commit;
COMMIT

Transaction 1 (window 1):

testi=# select * from test;
 a
---
(0 rows)
testi=# insert into test (a) values(5);
ERROR:  Cannot insert a duplicate key into unique index test_pkey
testi=# commit;
COMMIT
testi=# drop table test;
DROP

Example 2: I first delete the row with primary key 5. Then I try to insert a new row with the same primary key and get: 'ERROR: Cannot insert a duplicate key into unique index test_pkey'.

Setup (run once, before the transactions are started):

testi=# create table test (a integer, primary key (a));
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index 'test_pkey' for table 'test'
CREATE

Transaction 1 (window 1):

testi=# begin;
BEGIN
testi=# set transaction isolation level serializable;
SET VARIABLE

Transaction 2 (window 2):

testi=# begin;
BEGIN
testi=# set transaction isolation level serializable;
SET VARIABLE

Transaction 1 (window 1):

testi=# delete from test where a=5;
DELETE 0

Transaction 2 (window 2):

testi=# insert into test (a) values(5);
INSERT 149083 1
testi=# commit;
COMMIT

Transaction 1 (window 1):

testi=# insert into test (a) values(5);
ERROR:  Cannot insert a duplicate key into unique index test_pkey
testi=# commit;
COMMIT
testi=# drop table test;
DROP

I'm running PostgreSQL 7.2b4. I got similar results with 7.1.2. And I know that in both cases the transaction should fail, but the error message is wrong.
Mikko Vierula <mikko.vierula@elektroniikkatyo.fi> writes:
> I'm running PostgreSQL 7.2b4. I got similar results with 7.1.2. And I
> know that in both cases the transaction should fail, but the error
> message is wrong.

What would you have it do differently? Accept the insert and then give
some random error message at the commit? I'm sorry, but I don't see
a problem here.

			regards, tom lane
Mikko Vierula <mikko.vierula@elektroniikkatyo.fi> writes:
> But all those errors really are because of serialization problems.
> So shouldn't I be receiving an error stating that?

I disagree, because I don't think it's reasonable to expect the system
to make that deduction. I prefer a specific error message telling you
what's actually wrong ("duplicate key") to a vague error message that
might in fact be incorrect (leaping to a "can't serialize access"
conclusion).

In the example you give, the reason that you as an intelligent human can
classify the error as a serialization problem is that earlier in the
transaction you searched for the key and didn't find it. Had you not
done that, you could not argue that "duplicate key" is the wrong message.
Now, is the system supposed to remember that there was such a search,
and do the logical deductions needed to correlate the previous WHERE
clause with the immediate cause of failure? Sorry, I don't think so.
We'd have to remember the conditions and outputs of every SELECT
throughout every transaction in order to adjust this error message.
That's an unrealistic amount of overhead for a mighty small return.

I counsel tweaking your program logic so that a "duplicate key" error
at this point is treated as a retryable failure.

			regards, tom lane
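A hedged sketch of that retry suggestion, using the test table from the
examples at the top of the thread (this is not from the original mail; in
the 7.x releases there is no server-side way to trap the error, so the
retry loop itself has to live in the client application):

BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
DELETE FROM test WHERE a = 5;       -- remove the old row, if any
INSERT INTO test (a) VALUES (5);    -- may fail with the duplicate-key error
COMMIT;

-- If the INSERT fails with "Cannot insert a duplicate key into unique index
-- test_pkey", the client ends the aborted transaction and simply runs the
-- whole block again from BEGIN, exactly as it would after
-- "Can't serialize access due to concurrent update".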
Tom Lane wrote:
>
> Mikko Vierula <mikko.vierula@elektroniikkatyo.fi> writes:
> > But all those errors really are because of serialization problems.
> > So shouldn't I be receiving an error stating that?
>
> I disagree, because I don't think it's reasonable to expect the system
> to make that deduction. I prefer a specific error message telling you
> what's actually wrong ("duplicate key") to a vague error message that
> might in fact be incorrect (leaping to a "can't serialize access"
> conclusion).
>
> In the example you give, the reason that you as an intelligent human can
> classify the error as a serialization problem is that earlier in the
> transaction you searched for the key and didn't find it. Had you not
> done that, you could not argue that "duplicate key" is the wrong message.

PostgreSQL doesn't need to remember the preceding operations at all in
order to judge whether a "can't serialize access" error has occurred.
In both of Mikko's examples, transaction 1 has to see a tuple which
didn't exist at the beginning of the transaction in order to detect the
duplicate key.

regards,
Hiroshi Inoue
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> In both of Mikko's examples, transaction 1 has to see a tuple which
> didn't exist at the beginning of the transaction in order to detect
> the duplicate key.

I'm not sure about that. What if the "tuple that didn't exist" is
an updated version of a row that did exist earlier --- that is, the
conflicting operation is an update, not an insert? Does your answer
change depending on whether the update changed the row's key value?

In the most straightforward implementation of your suggestion, I believe
that a concurrent update (on a non-key column) would result in the
system randomly delivering either "duplicate key" or "serialization
error" depending on whether the index processing happened to look at
the older or the newer version of the other tuple first. That
definitely does not strike me as an improvement over the current
behavior.

In any case, I still think that this'd be trading a useful error message
(i.e., one that tells you what is wrong) for a uselessly non-specific one.

			regards, tom lane
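For concreteness, a hedged sketch of the concurrent-update case described
above (this session is not from the original mails; the extra column b and
the values are invented for illustration):

create table test (a integer primary key, b integer);
insert into test values (5, 0);

[session-1]
begin;
set transaction isolation level serializable;
select * from test;                   -- sees (5, 0); the snapshot is now fixed

[session-2]
begin;
update test set b = 1 where a = 5;    -- updates a non-key column only
commit;

[session-1]
insert into test (a, b) values (5, 2);
-- fails with "Cannot insert a duplicate key into unique index test_pkey";
-- here that message is arguably right, since key 5 existed all along, yet
-- the committed tuple version the uniqueness check runs into was created
-- after session-1's snapshot, so a rule of the form "conflicting tuple did
-- not exist at transaction start, therefore serialization error" could just
-- as well report the non-specific message instead.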
Tom Lane wrote:
>
> Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> > In both of Mikko's examples, transaction 1 has to see a tuple which
> > didn't exist at the beginning of the transaction in order to detect
> > the duplicate key.
>
> I'm not sure about that. What if the "tuple that didn't exist" is
> an updated version of a row that did exist earlier --- that is, the
> conflicting operation is an update, not an insert? Does your answer
> change depending on whether the update changed the row's key value?

Displaying both "can't serialize .." and "cannot insert a duplicate .."
seems better.

There's another case.

# create table t (id int4 primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index 't_pkey' for table 't'
CREATE
# insert into t values (1);
INSERT 1481246 1

[session-1]
visco=# begin;
BEGIN
visco=# set transaction isolation level serializable;
SET VARIABLE
visco=# select * from t;
 id
----
  1
(1 row)

[session-2]
# delete from t;
DELETE 1

[session-1]
# insert into t values (2);
INSERT 1481247 1

IMHO this should cause a "can't serialize .." error.

> In the most straightforward implementation of your suggestion, I believe
> that a concurrent update (on a non-key column) would result in the
> system randomly delivering either "duplicate key" or "serialization
> error" depending on whether the index processing happened to look at
> the older or the newer version of the other tuple first.

It depends on what *happened to look* means. Currently PostgreSQL
simply ignores updated/deleted tuples which don't satisfy the
qualification at the point of the snapshot, even though it may happen
to look at them. Such tuples are what I mean by *happened to look*.

regards,
Hiroshi Inoue
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> Currently PostgreSQL simply ignores updated/deleted tuples which don't
> satisfy the qualification at the point of the snapshot.

It does? It looks to me like we use a SnapshotDirty time qual check
and wait for any uncommitted transaction. This essentially means
that we use the "latest committed" state, not the query-start snapshot.
Look at _bt_check_unique in nbtinsert.c.

			regards, tom lane
Tom Lane wrote:
>
> Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> > Currently PostgreSQL simply ignores updated/deleted tuples which don't
> > satisfy the qualification at the point of the snapshot.
>
> It does? It looks to me like we use a SnapshotDirty time qual check
> and wait for any uncommitted transaction. This essentially means
> that we use the "latest committed" state, not the query-start snapshot.

I don't refer to SnapshotDirty (or Now/Self/Any) as the snapshot,
because they aren't real snapshots. What I meant is, e.g. in

  update t set .. = .. where ...;

PostgreSQL ignores the tuples which don't satisfy the where clause at
the point when the query started (or when the transaction started, in
serializable mode), even though it happened to look at already
updated/deleted (or inserted) tuples.

regards,
Hiroshi Inoue
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> What I meant is, e.g. in
>   update t set .. = .. where ...;
> PostgreSQL ignores the tuples which don't satisfy the where clause at
> the point when the query started (or when the transaction started, in
> serializable mode), even though it happened to look at already
> updated/deleted (or inserted) tuples.

Sure, but what we were discussing was the "duplicate key" error issued
by nbtinsert.c. AFAICS that depends on latest-commit status, not on
snapshots. Perhaps that behavior is itself a bug? If so, how would
you change it?

			regards, tom lane
Tom Lane wrote:
>
> Sure, but what we were discussing was the "duplicate key" error issued
> by nbtinsert.c. AFAICS that depends on latest-commit status, not on
> snapshots. Perhaps that behavior is itself a bug? If so, how would
> you change it?

An additional check seems to be needed in serializable mode, using the
serializable snapshot, though I'm not sure. Cases such as

  - currently dead, but was alive when the transaction began
  - currently alive, but didn't exist when ...

etc. may cause a "can't serialize ..." error.

regards,
Hiroshi Inoue
Tom Lane wrote:
> What would you have it do differently? Accept the insert and then give
> some random error message at the commit? I'm sorry, but I don't see
> a problem here.
>
>			regards, tom lane

Thanks Tom.

I would like to see the error message 'ERROR: Can't serialize access
due to concurrent update' during the transaction (after the insert or
the commit). The difference is that by receiving that message I know
there is nothing wrong with the SQL statements, the database, or the
program logic. After that I could retry. If I get any other error
message, I cannot tell whether I should retry or not.

In my case I have a table of timestamped values. For each timestamp
there can be only one value. If I get a new value with the same
timestamp, I need to overwrite the old one. This must happen in one
transaction. By first deleting and then inserting a value in one
transaction I can be sure that I don't delete anything without
inserting, and it also works fine when there isn't anything to delete.
But now I get "random" error messages. I could check for 'ERROR: Cannot
insert a duplicate key into unique index ?', but for more complicated
cases I would have to check for many other error messages too. And
because there aren't really any numeric error codes, that would be
impossible. But all those errors really are because of serialization
problems. So shouldn't I be receiving an error stating that?

regards,
Mikko
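A minimal sketch of that timestamped-values setup, to make the scenario
concrete (the table name, column names, and values are invented for
illustration and are not from the original mail):

CREATE TABLE samples (ts timestamp PRIMARY KEY, val float8);

-- "store a value for a timestamp, replacing any previous value"
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
DELETE FROM samples WHERE ts = '2002-01-25 12:00';
INSERT INTO samples (ts, val) VALUES ('2002-01-25 12:00', 42.0);
COMMIT;

-- When two clients run this block for the same timestamp at the same time,
-- one of them commits and the other fails on the INSERT, but with the
-- duplicate-key message rather than the serialization one.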