Thread: INSERT WHERE NOT EXISTS
Hi, I am developing application with PHP as the front end, PGSQL as the backend. I am trying to figure out what's the best way to do this. I want to check if an entry already exists in the table. If it does, then I will do UPDATE tablename .... otherwise, I will do INSER INTO tablename... What's the best way to do that? I can of course check first, and then put the login in PHP code, eg: // check if entry already exists SELECT COUNT(*) FROM tablename WHERE [cond] .. if($count >0) UPDATE else INSERT but this will double the hit to the database server, because for every operation I need to do SELECT COUNT(*) first. The data itself is not a lot, and the condition is not complex, but the hitting frequency is a lot. I vaguely remember in Oracle, there is something like this: INSERT INTO mytable SELECT 'value1', 'value2' FROM dummy_table WHERE NOT EXISTS (SELECT NULL FROM mytable WHERE mycondition) This query will do INSERT, if there is not an entry already in the TABLE mytable that match the condition mycondition. Otherwise, the INSERT just fails and return 0 (without returning error), so I can check on that and do update instead. This is especially useful in my case because about most of the time the INSERT will succeed, and thus will reduce the hit frequency to the DB server from PHP by probably a factor of 1.5 or so. Is there anything like that with PostgreSQL? I looked the docs and googled but haven't found anything. Anyhelp is greatly appreciated. Thanks. RDB -- Reuben D. Budiardja Department of Physics and Astronomy The University of Tennessee, Knoxville, TN ------------------------------------------------- /"\ ASCII Ribbon Campaign against HTML \ / email and proprietary format X attachments. / \ ------------------------------------------------- Have you been used by Microsoft today? Choose your life. Choose freedom. Choose LINUX. -------------------------------------------------
Just wrap it in a transaction: begin; select * from table where somefield='somevalue'; (in php code) if pg_num_rows>1... update table set field=value where somefield=somevalue; else insert into table (field) values (value); commit; On Wed, 25 Jun 2003, Reuben D. Budiardja wrote: > > Hi, > I am developing application with PHP as the front end, PGSQL as the backend. I > am trying to figure out what's the best way to do this. > I want to check if an entry already exists in the table. If it does, then I > will do > UPDATE tablename .... > > otherwise, I will do > INSER INTO tablename... > > What's the best way to do that? I can of course check first, and then put the > login in PHP code, eg: > > // check if entry already exists > SELECT COUNT(*) FROM tablename WHERE [cond] > .. > if($count >0) > UPDATE > else > INSERT > > but this will double the hit to the database server, because for every > operation I need to do SELECT COUNT(*) first. The data itself is not a lot, > and the condition is not complex, but the hitting frequency is a lot. > > I vaguely remember in Oracle, there is something like this: > > INSERT INTO mytable > SELECT 'value1', 'value2' > FROM dummy_table > WHERE NOT EXISTS > (SELECT NULL FROM mytable > WHERE mycondition) > > This query will do INSERT, if there is not an entry already in the TABLE > mytable that match the condition mycondition. Otherwise, the INSERT just > fails and return 0 (without returning error), so I can check on that and do > update instead. > > This is especially useful in my case because about most of the time the INSERT > will succeed, and thus will reduce the hit frequency to the DB server from > PHP by probably a factor of 1.5 or so. > > Is there anything like that with PostgreSQL? I looked the docs and googled but > haven't found anything. > > Anyhelp is greatly appreciated. Thanks. > > RDB >
On Wednesday 25 June 2003 20:06, Reuben D. Budiardja wrote: > Hi, > I am developing application with PHP as the front end, PGSQL as the > backend. I am trying to figure out what's the best way to do this. > I want to check if an entry already exists in the table. If it does, then I > will do > UPDATE tablename .... > > otherwise, I will do > INSER INTO tablename... (...) > I vaguely remember in Oracle, there is something like this: > > INSERT INTO mytable > SELECT 'value1', 'value2' > FROM dummy_table > WHERE NOT EXISTS > (SELECT NULL FROM mytable > WHERE mycondition) > > This query will do INSERT, if there is not an entry already in the TABLE > mytable that match the condition mycondition. Otherwise, the INSERT just > fails and return 0 (without returning error), so I can check on that and do > update instead. This kind of query should work; just leave out the "FROM dummy_table" bit. (in Oracle it would be "FROM dual"). Ian Barwick barwick@gmx.net
On Wednesday 25 June 2003 03:04 pm, scott.marlowe wrote: > Just wrap it in a transaction: > > begin; > select * from table where somefield='somevalue'; > (in php code) > if pg_num_rows>1... > update table set field=value where somefield=somevalue; > else > insert into table (field) values (value); > commit; Yes, but I don't see how this is more efficient than what I said previously (??) Thanks though. RDB > On Wed, 25 Jun 2003, Reuben D. Budiardja wrote: > > Hi, > > I am developing application with PHP as the front end, PGSQL as the > > backend. I am trying to figure out what's the best way to do this. > > I want to check if an entry already exists in the table. If it does, then > > I will do > > UPDATE tablename .... > > > > otherwise, I will do > > INSER INTO tablename... > > > > What's the best way to do that? I can of course check first, and then put > > the login in PHP code, eg: > > > > // check if entry already exists > > SELECT COUNT(*) FROM tablename WHERE [cond] > > .. > > if($count >0) > > UPDATE > > else > > INSERT > > > > but this will double the hit to the database server, because for every > > operation I need to do SELECT COUNT(*) first. The data itself is not a > > lot, and the condition is not complex, but the hitting frequency is a > > lot. > > > > I vaguely remember in Oracle, there is something like this: > > > > INSERT INTO mytable > > SELECT 'value1', 'value2' > > FROM dummy_table > > WHERE NOT EXISTS > > (SELECT NULL FROM mytable > > WHERE mycondition) > > > > This query will do INSERT, if there is not an entry already in the TABLE > > mytable that match the condition mycondition. Otherwise, the INSERT just > > fails and return 0 (without returning error), so I can check on that and > > do update instead. > > > > This is especially useful in my case because about most of the time the > > INSERT will succeed, and thus will reduce the hit frequency to the DB > > server from PHP by probably a factor of 1.5 or so. > > > > Is there anything like that with PostgreSQL? I looked the docs and > > googled but haven't found anything. > > > > Anyhelp is greatly appreciated. Thanks. > > > > RDB > > ---------------------------(end of broadcast)--------------------------- > TIP 7: don't forget to increase your free space map settings -- Reuben D. Budiardja Department of Physics and Astronomy The University of Tennessee, Knoxville, TN ------------------------------------------------- /"\ ASCII Ribbon Campaign against HTML \ / email and proprietary format X attachments. / \ ------------------------------------------------- Have you been used by Microsoft today? Choose your life. Choose freedom. Choose LINUX. -------------------------------------------------
Ian Barwick wrote: > On Wednesday 25 June 2003 20:06, Reuben D. Budiardja wrote: > >>Hi, >>I am developing application with PHP as the front end, PGSQL as the >>backend. I am trying to figure out what's the best way to do this. >>I want to check if an entry already exists in the table. If it does, then I >>will do >>UPDATE tablename .... >> >>otherwise, I will do >>INSER INTO tablename... > > (...) > > >>I vaguely remember in Oracle, there is something like this: >> >>INSERT INTO mytable >>SELECT 'value1', 'value2' >> FROM dummy_table >> WHERE NOT EXISTS >> (SELECT NULL FROM mytable >> WHERE mycondition) >> >>This query will do INSERT, if there is not an entry already in the TABLE >>mytable that match the condition mycondition. Otherwise, the INSERT just >>fails and return 0 (without returning error), so I can check on that and do >>update instead. > > > This kind of query should work; just leave out the "FROM dummy_table" bit. > (in Oracle it would be "FROM dual"). I proposed that same solution 3 years ago. Tom shoots it down: http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&threadm=3A4D6116.1A613402%40mascari.com&rnum=1&prev=/groups%3Fq%3DMike%2BMascari%2BINSERT%2BNOT%2BEXISTS%26ie%3DUTF-8%26oe%3DUTF-8%26hl%3Den Reuben must be prepared for unique key violation, I'm afraid. And, despite the optimism in the link, we still don't have savepoints. :-( Mike Mascari mascarm@mascari.com
On Wednesday 25 June 2003 03:26 pm, Ian Barwick wrote: <snip> > > I vaguely remember in Oracle, there is something like this: > > > > INSERT INTO mytable > > SELECT 'value1', 'value2' > > FROM dummy_table > > WHERE NOT EXISTS > > (SELECT NULL FROM mytable > > WHERE mycondition) > > > > This query will do INSERT, if there is not an entry already in the TABLE > > mytable that match the condition mycondition. Otherwise, the INSERT just > > fails and return 0 (without returning error), so I can check on that and > > do update instead. > > This kind of query should work; just leave out the "FROM dummy_table" bit. > (in Oracle it would be "FROM dual"). Hi, this seems to work. Thanks. Don't know why I didn't just try it. And yes, in Oracle it's SELECT .. FROM dual. RDB -- Reuben D. Budiardja Department of Physics and Astronomy The University of Tennessee, Knoxville, TN ------------------------------------------------- /"\ ASCII Ribbon Campaign against HTML \ / email and proprietary format X attachments. / \ ------------------------------------------------- Have you been used by Microsoft today? Choose your life. Choose freedom. Choose LINUX. -------------------------------------------------
On Wednesday 25 June 2003 21:37, Mike Mascari wrote: > Ian Barwick wrote: > > On Wednesday 25 June 2003 20:06, Reuben D. Budiardja wrote: (...) > > This kind of query should work; just leave out the "FROM dummy_table" > > bit. (in Oracle it would be "FROM dual"). > > I proposed that same solution 3 years ago. Tom shoots it down: > > http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&threadm=3A4D611 >6.1A613402%40mascari.com&rnum=1&prev=/groups%3Fq%3DMike%2BMascari%2BINSERT%2 >BNOT%2BEXISTS%26ie%3DUTF-8%26oe%3DUTF-8%26hl%3Den > > Reuben must be prepared for unique key violation, I'm afraid. And, > despite the optimism in the link, we still don't have savepoints. :-( aha, useful to know. Thanks. Ian Barwick barwick@gmx.net
On Wednesday 25 June 2003 03:37 pm, Mike Mascari wrote: > Ian Barwick wrote: > > On Wednesday 25 June 2003 20:06, Reuben D. Budiardja wrote: <snip> > >>INSERT INTO mytable > >>SELECT 'value1', 'value2' > >> FROM dummy_table > >> WHERE NOT EXISTS > >> (SELECT NULL FROM mytable > >> WHERE mycondition) > >> > >>This query will do INSERT, if there is not an entry already in the TABLE > >>mytable that match the condition mycondition. Otherwise, the INSERT just > >>fails and return 0 (without returning error), so I can check on that and > >> do update instead. > > > > This kind of query should work; just leave out the "FROM dummy_table" > > bit. (in Oracle it would be "FROM dual"). > > I proposed that same solution 3 years ago. Tom shoots it down: > > http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&threadm=3A4D611 >6.1A613402%40mascari.com&rnum=1&prev=/groups%3Fq%3DMike%2BMascari%2BINSERT%2 >BNOT%2BEXISTS%26ie%3DUTF-8%26oe%3DUTF-8%26hl%3Den > > Reuben must be prepared for unique key violation, I'm afraid. And, > despite the optimism in the link, we still don't have savepoints. :-( Interesting reading of the archive. In my particular case here, I don't have to worry too much about the race thing. But the inherent problem is still there. Where can I read / learn more about the so-called "savepoints"? This is the first time I've heard it. Granted, I've only been using postgresql recently. Thanks. RDB -- Reuben D. Budiardja Department of Physics and Astronomy The University of Tennessee, Knoxville, TN
On Wed, 25 Jun 2003, Ian Barwick wrote: > On Wednesday 25 June 2003 21:37, Mike Mascari wrote: > > Ian Barwick wrote: > > > On Wednesday 25 June 2003 20:06, Reuben D. Budiardja wrote: > > (...) > > > This kind of query should work; just leave out the "FROM dummy_table" > > > bit. (in Oracle it would be "FROM dual"). > > > > I proposed that same solution 3 years ago. Tom shoots it down: > > > > http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&threadm=3A4D611 > >6.1A613402%40mascari.com&rnum=1&prev=/groups%3Fq%3DMike%2BMascari%2BINSERT%2 > >BNOT%2BEXISTS%26ie%3DUTF-8%26oe%3DUTF-8%26hl%3Den > > > > Reuben must be prepared for unique key violation, I'm afraid. And, > > despite the optimism in the link, we still don't have savepoints. :-( > > aha, useful to know. Thanks. Oh yeah, in my example you need to do a select for update to be race safe.
> > This kind of query should work; just leave out the "FROM dummy_table" bit. > > (in Oracle it would be "FROM dual"). What's really fun is to insert extra records into "dual" on Oracle systems and watch all the applications crash :)
scott.marlowe wrote: > On Wed, 25 Jun 2003, Ian Barwick wrote: > >>On Wednesday 25 June 2003 21:37, Mike Mascari wrote: >> >>>I proposed that same solution 3 years ago. Tom shoots it down: >>> >>>http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&threadm=3A4D611 >>>6.1A613402%40mascari.com&rnum=1&prev=/groups%3Fq%3DMike%2BMascari%2BINSERT%2 >>>BNOT%2BEXISTS%26ie%3DUTF-8%26oe%3DUTF-8%26hl%3Den >>> >>>Reuben must be prepared for unique key violation, I'm afraid. And, >>>despite the optimism in the link, we still don't have savepoints. :-( >> >>aha, useful to know. Thanks. > > Oh yeah, in my example you need to do a select for update to be race safe. > But if two simultaneous "selects for update" fail to find rows, both clients will then attempt the INSERT, which will cause one of them to abort due to a unique key violation. In these "replace" scenarios, the application must be prepared for the unique key violation with the current version of PostgreSQL. Mike Mascari mascarm@mascari.com
Reuben D. Budiardja wrote: >>Reuben must be prepared for unique key violation, I'm afraid. And, >>despite the optimism in the link, we still don't have savepoints. :-( > > Interesting reading of the archive. In my particular case here, I don't have > to worry too much about the race thing. But the inherent problem is still > there. > > Where can I read / learn more about the so-called "savepoints"? This is the > first time I've heard it. Granted, I've only been using postgresql recently. Here's a link to Oracle's implementation: http://www.engin.umich.edu/caen/wls/software/oracle/server.901/a90125/statements_102.htm#2091263 If PostgreSQL had SAVEPOINTs or even nested transactions, one could handle a unique key violation without having to resubmit the whole transaction. Eg: BEGIN; <- Outer Tx .. do a lot of work .. LOOP: status = ABORTED WHILE (status == ABORTED) { BEGIN; <- Nested Tx UPDATE foo SET val = 1 WHERE key = 0; IF zero rows updated { INSERT INTO foo... ^- Causes unique key violation, but only Nested Tx in ABORT state } status = TX state; END; } END; I believe there are hackers working on nested transactions for 7.5? Mike Mascari mascarm@mascari.com > > Thanks. > RDB >
And what might a future version of PostgreSQL, or current versions of other RDMBS's do to prevent that? Mike Mascari wrote: > scott.marlowe wrote: > >>On Wed, 25 Jun 2003, Ian Barwick wrote: >> >> >>>On Wednesday 25 June 2003 21:37, Mike Mascari wrote: >>> >>> >>>>I proposed that same solution 3 years ago. Tom shoots it down: >>>> >>>>http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&threadm=3A4D611 >>>>6.1A613402%40mascari.com&rnum=1&prev=/groups%3Fq%3DMike%2BMascari%2BINSERT%2 >>>>BNOT%2BEXISTS%26ie%3DUTF-8%26oe%3DUTF-8%26hl%3Den >>>> >>>>Reuben must be prepared for unique key violation, I'm afraid. And, >>>>despite the optimism in the link, we still don't have savepoints. :-( >>> >>>aha, useful to know. Thanks. >> >>Oh yeah, in my example you need to do a select for update to be race safe. >> > > > But if two simultaneous "selects for update" fail to find rows, both > clients will then attempt the INSERT, which will cause one of them to > abort due to a unique key violation. In these "replace" scenarios, the > application must be prepared for the unique key violation with the > current version of PostgreSQL. > > Mike Mascari > mascarm@mascari.com > > > ---------------------------(end of broadcast)--------------------------- > TIP 9: the planner will ignore your desire to choose an index scan if your > joining column's datatypes do not match >
> // check if entry already exists > SELECT COUNT(*) FROM tablename WHERE [cond] You may want to try this instead: select exists(select * from tablename where [cond]) which should be faster than count(*). You'll get back a boolean value, though, not an integer. Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
That's why I resorted to "lock table", select, then insert/update. You have to block all the reads of other processes that are considering an insert. This is not great for performance, but I was certain it will work, unlike the race-vulnerable suggestions (are people here actually using those? Whoa!), or the "end up with everything rolled back". HOWEVER, IF (big if) all your applications and DB users can cooperate what you can do is lock a different dummy table e.g. ForInsertsOnTableX, and thus implement an application layer "select for insert". e.g. lock table ForInsertsOnTableA; select count(*) from TableA where field='foo'; if count=0 { insert into TableA ... } else if count >0 { update TableA .... where field='foo'; } else { D'oh.... Can count ever return negative or null? } That way other applications that are just doing selects but not "selects for inserts" don't run into the table locks, and performance doesn't go down as much (coz there's no way to unlock a table in postgresql other than aborting or committing a transaction, so if your transaction takes time...). AFAIK this method should work on most RDBMS. Now IF postgresql had a select for insert... But AFAIK that's nonstandard and requires discipline, but remembering to use select for update requires similar discipline too. (Related: I also suggested arbitrary user locks years back, but I wasn't able to implement them.) Summary: For postgresql if DB discipline is good and will remain good, you can use lock ForInsertsOnTableA, otherwise take the performance hit and lock TableA before select, insert/update. If there's a reason why this won't work, or there are better solutions, I'd sure like to know :). Regards, Link. At 04:23 PM 6/25/2003 -0400, Mike Mascari wrote: >But if two simultaneous "selects for update" fail to find rows, both >clients will then attempt the INSERT, which will cause one of them to >abort due to a unique key violation. In these "replace" scenarios, the >application must be prepared for the unique key violation with the >current version of PostgreSQL. > >Mike Mascari >mascarm@mascari.com > > >---------------------------(end of broadcast)--------------------------- >TIP 9: the planner will ignore your desire to choose an index scan if your > joining column's datatypes do not match
Lincoln Yeoh <lyeoh@pop.jaring.my> writes: > (Related: I also suggested arbitrary user locks years back, but I wasn't > able to implement them.) Don't we have 'em already? See contrib/userlock/. regards, tom lane
At 10:05 AM 6/26/2003 -0400, Tom Lane wrote: >Lincoln Yeoh <lyeoh@pop.jaring.my> writes: > > (Related: I also suggested arbitrary user locks years back, but I wasn't > > able to implement them.) > >Don't we have 'em already? See contrib/userlock/. Kinda. The one I was thinking of was locking on an arbitrary string - which would allows application level insert lock of even finer granularity amongst other things. You can then actually lock on stuff you are trying to insert and it won't block other unrelated inserts. Now that I think of it, one could achieve something possibly close enough with the existing userlock, just lock on the first/last/hash 32 bits of the data you want to insert. Sure some inserts will clash, but it's still better than blocking all other inserts. The 16 bit group could be for the tables. D'oh. Sure took me a long while to realize this... <sheepish grin>. Link.
Karsten Hilbert wrote:
SELECT 1 FROM tablename WHERE [cond];
And use row_count > 0 to determine a hit.
And I can't think of a way to prevent the race-condition thing besides using SERIALIZABLE ISOLATION.
Why not do:// check if entry already exists SELECT COUNT(*) FROM tablename WHERE [cond]You may want to try this instead: select exists(select * from tablename where [cond]) which should be faster than count(*). You'll get back a boolean value, though, not an integer. Karsten
SELECT 1 FROM tablename WHERE [cond];
And use row_count > 0 to determine a hit.
And I can't think of a way to prevent the race-condition thing besides using SERIALIZABLE ISOLATION.
Because >>select exists(select * from tablename where [cond]) will return after finding 1 matching row (or am I wrong ?) while > SELECT 1 FROM tablename WHERE [cond]; will return a 1 for *each* matching row (just tested). Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
At 01:36 AM 6/28/2003 +0200, Karsten Hilbert wrote: >Because > > >>select exists(select * from tablename where [cond]) > >will return after finding 1 matching row (or am I >wrong ?) while > > > SELECT 1 FROM tablename WHERE [cond]; > >will return a 1 for *each* matching row (just tested). You can use limit. Postgresql is smart enough to stop once it has the necessary rows[1]. select 1 from table name where cond limit 1; Link. [1] Or was it necessary rows + 1? Agh, something wrong with my memory.
Lincoln Yeoh <lyeoh@pop.jaring.my> writes: > You can use limit. Postgresql is smart enough to stop once it has the > necessary rows[1]. > [1] Or was it necessary rows + 1? Agh, something wrong with my memory. 7.4 stops at exactly the LIMIT row count. Several prior releases fetch one more row internally ... which is hardly noticeable in most contexts, but sure enough we got complaints ... regards, tom lane