Thread: Solution to UPDATE...INSERT problem
Hi Guys, I just thought I'd share with you guys a very clever solution to the old 'update row. if no rows affected, then insert the row' race condition problem. A guy at my work came up with it. We were discussing this earlier on -hackers, but no-one could find a solution that didn't involve locking the entire table around the update...insert commands. The problem is that sometimes the row will be inserted by another process between your update and insert, causing your insert to fail with a unique constraint violation. So, say this is the insert: INSERT INTO table VALUES (1, 'foo'); // 1 is in the primary key column Rewrite it like this: INSERT INTO table SELECT 1, 'foo' EXCEPT SELECT 1, 'foo' FROM table WHERE pkcol=1; See? So now that INSERT statement will insert the row if it doesn't exist, or insert zero rows if it does. You are then guaranteed that your transaction will not fail and rollback, so you can repeat your update, or do the insert first and then the update, etc. Hope that's handy for people, Chris
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes: > INSERT INTO table SELECT 1, 'foo' EXCEPT SELECT 1, 'foo' FROM table WHERE > pkcol=1; > See? So now that INSERT statement will insert the row if it doesn't exist, > or insert zero rows if it does. You are then guaranteed that your > transaction will not fail and rollback, so you can repeat your update, or do > the insert first and then the update, etc. Uh, why exactly do you think this is race-free? It looks fancy, but AFAICS the SELECT will return info that is correct as of its starting timestamp; which is not enough to guarantee that the INSERT won't conflict with another transaction doing the same thing concurrently. regards, tom lane
> Uh, why exactly do you think this is race-free? > > It looks fancy, but AFAICS the SELECT will return info that is correct > as of its starting timestamp; which is not enough to guarantee that the > INSERT won't conflict with another transaction doing the same thing > concurrently. How about: INSERT INTO table SELECT 1, 'foo' WHERE NOT EXISTS (SELECT TRUE FROM table WHERE pkcol=1 FOR UPDATE); It's a lot more straightforward and has a FOR UPDATE. Can this still cause unique constraint failures? Chris
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes: >> Uh, why exactly do you think this is race-free? > How about: > INSERT INTO table SELECT 1, 'foo' WHERE NOT EXISTS (SELECT TRUE FROM table > WHERE pkcol=1 FOR UPDATE); > It's a lot more straightforward and has a FOR UPDATE. Can this still cause > unique constraint failures? Certainly. FOR UPDATE locks an existing row; it cannot lock the condition of non-existence of a row. regards, tom lane
El jue, 27-03-2003 a las 03:41, Tom Lane escribió: > "Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes: > > INSERT INTO table SELECT 1, 'foo' EXCEPT SELECT 1, 'foo' FROM table WHERE > > pkcol=1; > > > See? So now that INSERT statement will insert the row if it doesn't exist, > > or insert zero rows if it does. You are then guaranteed that your > > transaction will not fail and rollback, so you can repeat your update, or do > > the insert first and then the update, etc. > > Uh, why exactly do you think this is race-free? > > It looks fancy, but AFAICS the SELECT will return info that is correct > as of its starting timestamp; which is not enough to guarantee that the > INSERT won't conflict with another transaction doing the same thing > concurrently. This approach certainly reduces significantly the time span within which a race could occur, compared to, say, using two separate statements, or worse, two statements in two consecutive transactions. But race conditions either exist or they don't, so you're right. Now, up to my knowledge this problem was only a an intractable one in PostgreSQL because of transactions going into abort state once a unique restriction violation happened. If savepoints/nested transactions were there, one would simply put the insert within a protected area, and retry as much as needed. That's my bet why other databases don't seem to have a problem with this one. Am I right? Or do they have some magic solution other than locking the whole table? Regards Haroldo
AFAIK the "except" select won't see other inserts in uncommitted transactions. If those transactions are committed you will end up with the same problem. You can try it yourself, by manually doing two separate transactions in psql. You either have to lock the whole table, or lock at the application layer. Some time back I suggested a "lock on arbitrary string" feature for postgresql for this and various other purposes, but that feature probably wouldn't scale in terms of management (it requires 100% cooperation amongst all apps/clients involved). There's no "select * from table where pkey=x for insert;" which would block on uncommitted inserts/updates of pkey=x and other selects for insert/update. In contrast "select ... for update" blocks on committed stuff. Regards, Link. At 09:55 AM 3/27/03 +0800, Christopher Kings-Lynne wrote: >Hi Guys, > >I just thought I'd share with you guys a very clever solution to the old >'update row. if no rows affected, then insert the row' race condition >problem. A guy at my work came up with it. > >We were discussing this earlier on -hackers, but no-one could find a >solution that didn't involve locking the entire table around the >update...insert commands. > >The problem is that sometimes the row will be inserted by another process >between your update and insert, causing your insert to fail with a unique >constraint violation. > >So, say this is the insert: > >INSERT INTO table VALUES (1, 'foo'); // 1 is in the primary key column > >Rewrite it like this: > >INSERT INTO table SELECT 1, 'foo' EXCEPT SELECT 1, 'foo' FROM table WHERE >pkcol=1; > >See? So now that INSERT statement will insert the row if it doesn't exist, >or insert zero rows if it does. You are then guaranteed that your >transaction will not fail and rollback, so you can repeat your update, or do >the insert first and then the update, etc. > >Hope that's handy for people, > >Chris > > >---------------------------(end of broadcast)--------------------------- >TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org