Thread: Solution to UPDATE...INSERT problem
Hi Guys, I just thought I'd share with you guys a very clever solution to the old 'update row. if no rows affected, then insert the row' race condition problem. A guy at my work came up with it. We were discussing this earlier on -hackers, but no-one could find a solution that didn't involve locking the entire table around the update...insert commands. The problem is that sometimes the row will be inserted by another process between your update and insert, causing your insert to fail with a unique constraint violation. So, say this is the insert: INSERT INTO table VALUES (1, 'foo'); // 1 is in the primary key column Rewrite it like this: INSERT INTO table SELECT 1, 'foo' EXCEPT SELECT 1, 'foo' FROM table WHERE pkcol=1; See? So now that INSERT statement will insert the row if it doesn't exist, or insert zero rows if it does. You are then guaranteed that your transaction will not fail and rollback, so you can repeat your update, or do the insert first and then the update, etc. Hope that's handy for people, Chris
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes: > INSERT INTO table SELECT 1, 'foo' EXCEPT SELECT 1, 'foo' FROM table WHERE > pkcol=1; > See? So now that INSERT statement will insert the row if it doesn't exist, > or insert zero rows if it does. You are then guaranteed that your > transaction will not fail and rollback, so you can repeat your update, or do > the insert first and then the update, etc. Uh, why exactly do you think this is race-free? It looks fancy, but AFAICS the SELECT will return info that is correct as of its starting timestamp; which is not enough to guarantee that the INSERT won't conflict with another transaction doing the same thing concurrently. regards, tom lane
AFAIK the "except" select won't see other inserts in uncommitted transactions. If those transactions are committed you will end up with the same problem. You can try it yourself, by manually doing two separate transactions in psql. You either have to lock the whole table, or lock at the application layer. Some time back I suggested a "lock on arbitrary string" feature for postgresql for this and various other purposes, but that feature probably wouldn't scale in terms of management (it requires 100% cooperation amongst all apps/clients involved). There's no "select * from table where pkey=x for insert;" which would block on uncommitted inserts/updates of pkey=x and other selects for insert/update. In contrast "select ... for update" blocks on committed stuff. Regards, Link. At 09:55 AM 3/27/03 +0800, Christopher Kings-Lynne wrote: >Hi Guys, > >I just thought I'd share with you guys a very clever solution to the old >'update row. if no rows affected, then insert the row' race condition >problem. A guy at my work came up with it. > >We were discussing this earlier on -hackers, but no-one could find a >solution that didn't involve locking the entire table around the >update...insert commands. > >The problem is that sometimes the row will be inserted by another process >between your update and insert, causing your insert to fail with a unique >constraint violation. > >So, say this is the insert: > >INSERT INTO table VALUES (1, 'foo'); // 1 is in the primary key column > >Rewrite it like this: > >INSERT INTO table SELECT 1, 'foo' EXCEPT SELECT 1, 'foo' FROM table WHERE >pkcol=1; > >See? So now that INSERT statement will insert the row if it doesn't exist, >or insert zero rows if it does. You are then guaranteed that your >transaction will not fail and rollback, so you can repeat your update, or do >the insert first and then the update, etc. > >Hope that's handy for people, > >Chris > > >---------------------------(end of broadcast)--------------------------- >TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
so the only real solution to this now is in application code outside of a transatction, i.e. PHP,Perl,VB,C,Python, etc, right? Lincoln Yeoh wrote: > AFAIK the "except" select won't see other inserts in uncommitted > transactions. If those transactions are committed you will end up with > the same problem. You can try it yourself, by manually doing two > separate transactions in psql. > > You either have to lock the whole table, or lock at the application > layer. Some time back I suggested a "lock on arbitrary string" feature > for postgresql for this and various other purposes, but that feature > probably wouldn't scale in terms of management (it requires 100% > cooperation amongst all apps/clients involved). > > There's no "select * from table where pkey=x for insert;" which would > block on uncommitted inserts/updates of pkey=x and other selects for > insert/update. > > In contrast "select ... for update" blocks on committed stuff. > > Regards, > Link. > > At 09:55 AM 3/27/03 +0800, Christopher Kings-Lynne wrote: > >> Hi Guys, >> >> I just thought I'd share with you guys a very clever solution to the old >> 'update row. if no rows affected, then insert the row' race condition >> problem. A guy at my work came up with it. >> >> We were discussing this earlier on -hackers, but no-one could find a >> solution that didn't involve locking the entire table around the >> update...insert commands. >> >> The problem is that sometimes the row will be inserted by another process >> between your update and insert, causing your insert to fail with a unique >> constraint violation. >> >> So, say this is the insert: >> >> INSERT INTO table VALUES (1, 'foo'); // 1 is in the primary key column >> >> Rewrite it like this: >> >> INSERT INTO table SELECT 1, 'foo' EXCEPT SELECT 1, 'foo' FROM table WHERE >> pkcol=1; >> >> See? So now that INSERT statement will insert the row if it doesn't >> exist, >> or insert zero rows if it does. You are then guaranteed that your >> transaction will not fail and rollback, so you can repeat your update, >> or do >> the insert first and then the update, etc. >> >> Hope that's handy for people, >> >> Chris >> >> >> ---------------------------(end of broadcast)--------------------------- >> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org >
El jue, 27-03-2003 a las 03:41, Tom Lane escribió: > "Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes: > > INSERT INTO table SELECT 1, 'foo' EXCEPT SELECT 1, 'foo' FROM table WHERE > > pkcol=1; > > > See? So now that INSERT statement will insert the row if it doesn't exist, > > or insert zero rows if it does. You are then guaranteed that your > > transaction will not fail and rollback, so you can repeat your update, or do > > the insert first and then the update, etc. > > Uh, why exactly do you think this is race-free? > > It looks fancy, but AFAICS the SELECT will return info that is correct > as of its starting timestamp; which is not enough to guarantee that the > INSERT won't conflict with another transaction doing the same thing > concurrently. This approach certainly reduces significantly the time span within which a race could occur, compared to, say, using two separate statements, or worse, two statements in two consecutive transactions. But race conditions either exist or they don't, so you're right. Now, up to my knowledge this problem was only a an intractable one in PostgreSQL because of transactions going into abort state once a unique restriction violation happened. If savepoints/nested transactions were there, one would simply put the insert within a protected area, and retry as much as needed. That's my bet why other databases don't seem to have a problem with this one. Am I right? Or do they have some magic solution other than locking the whole table? Regards Haroldo
At 05:28 PM 3/27/03 +0800, Christopher Kings-Lynne wrote: > > There's no "select * from table where pkey=x for insert;" which would >block > > on uncommitted inserts/updates of pkey=x and other selects for >insert/update. > >How about user locks? Isn't there something in contrib/ for that??? I >could do a userlock on the primary key, whether it existed or not? Depends on your case, whether you can correctly convert your potential primary keys into integers to be locked on. It still requires full cooperation by all relevant apps/clients. Actually select ... for updates also require cooperation, but it's a standard way of doing things, so apps that don't cooperate can be said to be broken :). Is there a standard for "select ... for insert"? Or lock table for insert where pkey=x? Regards, Link.