Thread: Hard problem with concurrency
OK, this is the problem: I want to write a bit of SQL so that if a row exists in a table, it gets updated, and otherwise it gets inserted. Problem is, there is a very high chance of simultaneous execution of this query on the same row (the rows have a unique index).

So, strategy one:

begin;
update row;
if (no rows affected) insert row;
commit;

Problem - race condition! If two transactions run at the same time, the second will end up doing an insert on a unique row, which will cause query failure.

Strategy two:

begin;
select row for update;
if (row returned) update; else insert;
commit;

Problem - race condition. The row-level locking doesn't allow me to lock 'potential rows', so if the row does not yet exist and two transactions run simultaneously, the second will die with a unique violation.

Strategy three:

begin;
lock table in exclusive mode;
update row;
if (no rows affected) insert row;
commit;

Problem - works, but this table needs high concurrency. Every time a member hits a page of the site that needs authentication, this function is called. In particular, the login transaction can sometimes take a little time, and we can't halt everyone else's activities for that duration...

So what is the solution??? I'm not sure if acquiring a ROW EXCLUSIVE MODE lock will help at all. Also, I can't try the insert and then the update, because the INSERT, in Postgres, will cause an outright transaction failure.

What the heck is the solution??

Chris
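For concreteness, strategy one might look like this against a hypothetical hits table (the table and column names here are invented for illustration; the "no rows affected" check lives in the application):

BEGIN;
UPDATE hits SET hit_count = hit_count + 1 WHERE member_id = 42;
-- only if the UPDATE reported zero rows affected:
INSERT INTO hits (member_id, hit_count) VALUES (42, 1);
COMMIT;

If two sessions run this concurrently for a member_id that is not yet in the table, both UPDATEs affect zero rows and the slower INSERT dies with a unique violation - exactly the race described above.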
Solution one: use sequences for the unique value.

Solution two: use another table to effect the exclusive locking, and use it to store the "unique" values:

begin;
update row;
if (no rows affected) {
    lock table foo in exclusive mode;
    find a unique value that is not already in foo;
    store this value inside of foo;
    insert row;
}
commit;

Solution three: use your strategy two, but throw a loop around it and have it try again (with a new value) if it gets a unique violation.

--
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200302162143
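The point of solution two is that only would-be inserters queue on the side table, while readers and updaters of the real table stay unblocked. A variant of that idea as concrete SQL, reusing the hypothetical hits table from above (untested sketch, correct only under read committed; the side table exists purely to be locked):

BEGIN;
UPDATE hits SET hit_count = hit_count + 1 WHERE member_id = 42;
-- only if no rows were affected:
LOCK TABLE hits_insert_lock IN EXCLUSIVE MODE;
-- retry the update under the lock, since a concurrent session may
-- have inserted the row while we were waiting for the lock
UPDATE hits SET hit_count = hit_count + 1 WHERE member_id = 42;
-- only if still no rows were affected:
INSERT INTO hits (member_id, hit_count) VALUES (42, 1);
COMMIT;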
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes: > I can't try the insert and then the update because the INSERT, in Postgres, > will cause an outright transaction failure. Do the update, then try to insert if the update found nothing, and put a retry loop around the whole transaction in case you fail because of concurrent inserts. Realistically you will need a retry loop in all but the most trivial cases anyway --- certainly so if you want to use serializable transaction mode. So I don't think this solution is unworkably complex. regards, tom lane
> Do the update, then try to insert if the update found nothing, and put
> a retry loop around the whole transaction in case you fail because of
> concurrent inserts.
>
> Realistically you will need a retry loop in all but the most trivial
> cases anyway --- certainly so if you want to use serializable
> transaction mode. So I don't think this solution is unworkably complex.

I guess that will work, but it will not prevent our log from being spammed up with error messages. Also, this is a somewhat simplified case. Some transactions, such as our login transaction, have rather large numbers of operations in them, and we don't want to have to roll back the whole thing. I guess we'll have to live with it.

REPLACE INTO anyone? ;)

Chris
Hm, odd, nobody mentioned this solution:

If you don't have a primary key already, create a unique index on the combination you want to be unique. Then:

. Try to insert the record
. If you get a duplicate key error, then do update instead

No possibilities of duplicate records due to race conditions. If two people try to insert/update at the same time you'll only get one of the two results, but that's the downside of the general approach you've taken. It's a tad inefficient if the usual case is updates, but certainly not less efficient than doing table locks.

I'm not sure what you're implementing here. Depending on what it is, you might consider having a table of raw data that you _only_ insert into. Then you process those results into a table with the consolidated data you're trying to gather. I've usually found that's more flexible later, because then you have all the raw data in the database even if you only present a limited view.

--
greg
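As plain SQL, this insert-first variant is the mirror image of the original strategy one (names still illustrative; the duplicate-key check lives in the application):

INSERT INTO hits (member_id, hit_count) VALUES (42, 1);
-- only if the INSERT failed with a duplicate-key error:
UPDATE hits SET hit_count = hit_count + 1 WHERE member_id = 42;

Note that in PostgreSQL the failed INSERT aborts the current transaction, which was Chris's original objection, so the two statements would need to run in separate transactions.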
> If you don't have a primary key already, create a unique index on the
> combination you want to be unique. Then:
>
> . Try to insert the record
> . If you get a duplicate key error, then do update instead
>
> No possibilities of duplicate records due to race conditions. If two
> people try to insert/update at the same time you'll only get one of the
> two results, but that's the downside of the general approach you've
> taken. It's a tad inefficient if the usual case is updates, but
> certainly not less efficient than doing table locks.

The idea was to stop our postgres logs being spammed up with unique constraint violation warnings... in which case your solution above is identical to our current one (update and, if it fails, insert), except that since the row is likely to already be there, our current way will be a bit more efficient.

Chris
On Sun, Feb 16, 2003 at 23:51:49 -0500, Greg Stark <gsstark@mit.edu> wrote:
>
> Hm, odd, nobody mentioned this solution:
>
> If you don't have a primary key already, create a unique index on the
> combination you want to be unique. Then:
>
> . Try to insert the record
> . If you get a duplicate key error, then do update instead

I think this solution depends on records not being deleted. Otherwise a record could be deleted between the insert attempt and the update attempt.
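The deletion race, written out as a timeline (sessions and names hypothetical):

-- Session A: INSERT INTO hits ... fails with a duplicate-key error
-- Session B: DELETE FROM hits WHERE member_id = 42; COMMIT;
-- Session A: UPDATE hits SET ... WHERE member_id = 42;  -- zero rows!

So a single insert-then-update fallback isn't enough; session A has to loop back and retry the INSERT, which is why the retry loop keeps reappearing in this thread.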
Christopher Kings-Lynne wrote:
> > If you don't have a primary key already, create a unique index on the
> > combination you want to be unique. Then:
> >
> > . Try to insert the record
> > . If you get a duplicate key error, then do update instead
> >
> > No possibilities of duplicate records due to race conditions. If two
> > people try to insert/update at the same time you'll only get one of
> > the two results, but that's the downside of the general approach
> > you've taken. It's a tad inefficient if the usual case is updates,
> > but certainly not less efficient than doing table locks.
>
> The idea was to stop our postgres logs being spammed up with unique
> constraint violation warnings... in which case your solution above is
> identical to our current one (update and, if it fails, insert), except
> that since the row is likely to already be there, our current way will
> be a bit more efficient.

To control the spamming, use server_min_messages before the INSERT.

--
Bruce Momjian                   | http://candle.pha.pa.us
pgman@candle.pha.pa.us          | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup.    | Newtown Square, Pennsylvania 19073
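Presumably something along these lines; the exact variable and level names vary across server versions, so treat this as a sketch (and note that it changes the threshold for the whole session until it is set back):

-- raise the server-log threshold so the expected unique-violation
-- ERROR is not written to the log, then put it back afterwards
SET server_min_messages = fatal;
INSERT INTO hits (member_id, hit_count) VALUES (42, 1);
SET server_min_messages = notice;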
Christopher Kings-Lynne writes:

> REPLACE INTO anyone? ;)

The upcoming SQL 200x standard includes a MERGE command that appears to fulfill that purpose.

--
Peter Eisentraut peter_e@gmx.net
FWIW, that's the approach O*'s taking.

http://otn.oracle.com/products/oracle9i/daily/Aug24.html
Peter Eisentraut wrote on 18.02.2003 at 21:02:
> Christopher Kings-Lynne writes:
>
> > REPLACE INTO anyone? ;)
>
> The upcoming SQL 200x standard includes a MERGE command that appears to
> fulfill that purpose.

Where is this upcoming standard available on the net?

----------------
Hannu
Hannu Krosing writes:
> Where is this upcoming standard available on the net?

Near ftp://sqlstandards.org/SC32/WG3/Progression_Documents/FCD

--
Peter Eisentraut peter_e@gmx.net
> > REPLACE INTO anyone? ;)
>
> The upcoming SQL 200x standard includes a MERGE command that appears to
> fulfill that purpose.

Is there somewhere that I can read that spec? Or can you just post the MERGE syntax for us?

*sigh* It's just like a standard to come up with a totally new syntax for a feature that no one has except MySQL, who use a different syntax :)

Thanks,

Chris
URL added to developers FAQ.

---------------------------------------------------------------------------

Peter Eisentraut wrote:
> Hannu Krosing writes:
>
> > Where is this upcoming standard available on the net?
>
> Near ftp://sqlstandards.org/SC32/WG3/Progression_Documents/FCD

--
Bruce Momjian                   | http://candle.pha.pa.us
pgman@candle.pha.pa.us          | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup.    | Newtown Square, Pennsylvania 19073
Christopher Kings-Lynne wrote:
>
> *sigh* It's just like a standard to come up with a totally new syntax for a
> feature that no one has except MySQL, who use a different syntax :)

You sure? :)

http://otn.oracle.com/products/oracle9i/daily/Aug24.html

MERGE INTO SALES_FACT D
USING SALES_JUL01 S
ON (D.TIME_ID = S.TIME_ID AND D.STORE_ID = S.STORE_ID
    AND D.REGION_ID = S.REGION_ID)
WHEN MATCHED THEN UPDATE
    SET d_parts     = d_parts + s_parts,
        d_sales_amt = d_sales_amt + s_sales_amt,
        d_tax_amt   = d_tax_amt + s_tax_amt,
        d_discount  = d_discount + s_discount
WHEN NOT MATCHED THEN INSERT
    (D.TIME_ID, D.STORE_ID, D.REGION_ID,
     D.PARTS, D.SALES_AMT, D.TAX_AMT, D.DISCOUNT)
    VALUES (S.TIME_ID, S.STORE_ID, S.REGION_ID,
            S.PARTS, S.SALES_AMT, S.TAX_AMT, S.DISCOUNT);

For those who last played with 8X, they have a couple of other new features in 9i. This is the best doc I saw talking about them.

http://www.oracle-base.com/Articles/9i/SQLNewFeatures9i.asp
Christopher Kings-Lynne writes:
> *sigh* It's just like a standard to come up with a totally new syntax for a
> feature that no one has except MySQL, who use a different syntax :)

Actually that command was copied straight out of Oracle.

--
Peter Eisentraut peter_e@gmx.net
On 2003-02-18 20:02:29 +0100, Peter Eisentraut wrote:
> Christopher Kings-Lynne writes:
>
> > REPLACE INTO anyone? ;)
>
> The upcoming SQL 200x standard includes a MERGE command that appears to
> fulfill that purpose.

MySQL features a poor man's approach to this problem, their REPLACE command:

http://www.mysql.com/doc/en/REPLACE.html

REPLACE works exactly like INSERT, except that if an old record in the table has the same value as a new record on a UNIQUE index or PRIMARY KEY, the old record is deleted before the new record is inserted.

I'd love to see this kind of functionality in PG; I've got a database that caches data which only gets conditional INSERT/UPDATEs, so that would save a lot of wasted SQL commands.

Vincent van Leeuwen
Media Design
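The syntax mirrors INSERT; reusing the illustrative hits table from earlier in the thread:

REPLACE INTO hits (member_id, hit_count) VALUES (42, 1);
-- if a row with member_id = 42 already exists under a UNIQUE index
-- or PRIMARY KEY, MySQL deletes it first, then inserts this row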
Hi,

Vincent van Leeuwen wrote, On 2/19/2003 10:08 PM:
> On 2003-02-18 20:02:29 +0100, Peter Eisentraut wrote:
> > Christopher Kings-Lynne writes:
> >
> > > REPLACE INTO anyone? ;)
> >
> > The upcoming SQL 200x standard includes a MERGE command that appears to
> > fulfill that purpose.
>
> MySQL features a poor man's approach to this problem, their REPLACE command:
>
> http://www.mysql.com/doc/en/REPLACE.html
>
> REPLACE works exactly like INSERT, except that if an old record in the table
> has the same value as a new record on a UNIQUE index or PRIMARY KEY, the old
> record is deleted before the new record is inserted.
>
> I'd love to see this kind of functionality in PG; I've got a database that
> caches data which only gets conditional INSERT/UPDATEs, so that would save a
> lot of wasted SQL commands.

I think this REPLACE function in MySQL is stupid. It deletes the row, and what if that row is referenced from another table? You lose your connection, your relation. However, you can easily write a procedure that does a real replace: it checks whether the same data (by keys) is already in the table, and if so does an update; if not, does an insert. You can do anything you like, unlike in MySQL; just write it the way you want.

C.
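With a function like the hypothetical merge_hit() sketched earlier in the thread, the caller gets exactly such a "real replace": the existing row is updated in place, so references from other tables survive:

SELECT merge_hit(42, 1);  -- updates the row if present, inserts it otherwise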
On Mon, 17 Feb 2003 09:51:54 +0800, "Christopher Kings-Lynne" <chriskl@familyhealth.com.au> wrote:
> Strategy three:
>
> begin;
> lock table in exclusive mode;
> update row;
> if (no rows affected) insert row;
> commit;
>
> Problem - Works, but this table needs high concurrency.

Chris, distributing congestion might improve that. Instead of locking the whole table, just lock one row in a dummy table:

CREATE TABLE dummylock (id INT PRIMARY KEY);
INSERT INTO dummylock VALUES (0);
INSERT INTO dummylock VALUES (1);
...
INSERT INTO dummylock VALUES (999);

Create an immutable function lockhash(<type of PK>) returning a value between 0 and 999.

BEGIN;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
UPDATE t SET c = 'newval' WHERE pk = 'pk';
if (no rows affected) THEN
    SELECT * FROM dummylock WHERE id = lockhash('pk') FOR UPDATE;
    -- try again
    UPDATE t SET c = 'newval' WHERE pk = 'pk';
    if (no rows affected) THEN
        INSERT INTO t ...;
    END IF;
END IF;
COMMIT;

This is just an idea. Completely untested ...

Servus
Manfred
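A minimal lockhash() for an integer primary key might look like this (illustrative and untested; a non-integer key would need a real hash function):

CREATE FUNCTION lockhash(integer) RETURNS integer AS
    'SELECT abs($1) % 1000'
    LANGUAGE SQL IMMUTABLE;

Each would-be inserter then row-locks just one of the 1000 dummy rows, so two sessions only wait on each other if their keys happen to hash to the same bucket, instead of everyone queueing behind a single table lock.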