Thread: Hard problem with concurrency

Hard problem with concurrency

From
"Christopher Kings-Lynne"
Date:
OK, this is the problem:

I want to write a bit of SQL such that if a row exists in a table, it gets
updated, otherwise it gets inserted.  Problem is, there is a very high chance
of simultaneous execution of this query on the same row (the rows have a
unique index).
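
For concreteness in the examples below, imagine a hypothetical table along
these lines (the real schema doesn't matter):

CREATE TABLE hits (
    member_id  integer PRIMARY KEY,   -- the column with the unique index
    hit_count  integer NOT NULL,
    last_seen  timestamp NOT NULL
);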

So, strategy one:

begin;
update row;
if (no rows affected) insert row;
commit;
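
Concretely, with the hypothetical hits table above:

BEGIN;
UPDATE hits SET hit_count = hit_count + 1, last_seen = now()
    WHERE member_id = 42;
-- the client checks the rows-affected count of the UPDATE here
INSERT INTO hits (member_id, hit_count, last_seen)
    VALUES (42, 1, now());   -- issued only if the UPDATE touched 0 rows
COMMIT;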

Problem - race condition!  If two transactions run at the same time, the
second will end up doing an insert that violates the unique index, which
will cause query failure.

Strategy two:

begin;
select row for update;
if (row returned) update;
else insert;
commit;
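
Concretely:

BEGIN;
SELECT member_id FROM hits WHERE member_id = 42 FOR UPDATE;
-- if a row came back, it is now locked, so the update can't race:
UPDATE hits SET hit_count = hit_count + 1, last_seen = now()
    WHERE member_id = 42;
-- if no row came back:
INSERT INTO hits (member_id, hit_count, last_seen)
    VALUES (42, 1, now());
COMMIT;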

Problem - race condition.  The row-level locking doesn't allow me to lock
'potential rows', so if the row does not yet exist and two transactions run
simultaneously then the second will die with a unique violation.

Strategy three:

begin;
lock table in exclusive mode;
update row;
if (no rows affected) insert row;
commit;
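
Concretely:

BEGIN;
LOCK TABLE hits IN EXCLUSIVE MODE;  -- blocks all other writers until COMMIT
UPDATE hits SET hit_count = hit_count + 1, last_seen = now()
    WHERE member_id = 42;
-- if 0 rows affected:
INSERT INTO hits (member_id, hit_count, last_seen)
    VALUES (42, 1, now());
COMMIT;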

Problem - Works, but this table needs high concurrency.  Every time a member
hits a page of the site that needs authentication, this function is called.
In particular, the login transaction can take a little time sometimes and we
can't halt everyone else's activities for that duration...

So what is the solution???

I'm not sure if acquiring a ROW EXCLUSIVE MODE lock will help at all.  Also,
I can't try the insert and then the update because the INSERT, in Postgres,
will cause an outright transaction failure.

What the heck is the solution??

Chris




Re: Hard problem with concurrency

From
greg@turnstep.com
Date:


Solution one: use sequences for the unique value.

Solution two: use another table to effect the exclusive locking 
and use it to store the "unique" values:

begin;
update row;
if (no rows affected) {
    lock table foo in exclusive mode;
    find a unique value that is not already in foo;
    store this value inside of foo;
    insert row;
}
commit;
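
One way to adapt that idea when the key is already known (names invented
here; the side table exists purely to be locked, so updaters never wait
on it):

-- one-time setup:  CREATE TABLE upsert_lock (dummy integer);

BEGIN;
UPDATE hits SET hit_count = hit_count + 1, last_seen = now()
    WHERE member_id = 42;
-- if 0 rows affected:
LOCK TABLE upsert_lock IN EXCLUSIVE MODE;  -- only the insert path waits here
-- retry the UPDATE: someone may have inserted while we waited for the lock
UPDATE hits SET hit_count = hit_count + 1, last_seen = now()
    WHERE member_id = 42;
-- if still 0 rows affected:
INSERT INTO hits (member_id, hit_count, last_seen)
    VALUES (42, 1, now());
COMMIT;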

Solution three: use your strategy two, but throw a loop around it and have 
it try again (with a new value) if it gets a unique violation.


--
Greg Sabino Mullane  greg@turnstep.com
PGP Key: 0x14964AC8 200302162143





Re: Hard problem with concurrency

From
Tom Lane
Date:
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
> I can't try the insert and then the update because the INSERT, in Postgres,
> will cause an outright transaction failure.

Do the update, then try to insert if the update found nothing, and put
a retry loop around the whole transaction in case you fail because of
concurrent inserts.

Realistically you will need a retry loop in all but the most trivial
cases anyway --- certainly so if you want to use serializable
transaction mode.  So I don't think this solution is unworkably complex.
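
A sketch of that loop in PL/pgSQL, against the hypothetical hits table from
the first message (the EXCEPTION clause needs PostgreSQL 8.0 or later; done
client-side instead, the equivalent is to ROLLBACK and rerun the whole
transaction when the INSERT fails):

CREATE FUNCTION upsert_hit(p_member integer) RETURNS void AS $$
BEGIN
    LOOP
        -- try the update first
        UPDATE hits SET hit_count = hit_count + 1, last_seen = now()
            WHERE member_id = p_member;
        IF FOUND THEN
            RETURN;
        END IF;
        -- row wasn't there: try to insert it
        BEGIN
            INSERT INTO hits (member_id, hit_count, last_seen)
                VALUES (p_member, 1, now());
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            NULL;  -- lost a race to a concurrent insert; retry the UPDATE
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
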
        regards, tom lane


Re: Hard problem with concurrency

From
"Christopher Kings-Lynne"
Date:
> Do the update, then try to insert if the update found nothing, and put
> a retry loop around the whole transaction in case you fail because of
> concurrent inserts.
>
> Realistically you will need a retry loop in all but the most trivial
> cases anyway --- certainly so if you want to use serializable
> transaction mode.  So I don't think this solution is unworkably complex.

I guess that will work, but it will not prevent our log from being spammed
up with error messages.  Also, this is a somewhat simplified case.  Some
transactions, such as our login transaction, have rather large numbers of
operations in them and we don't want to have to roll back the whole thing.
I guess we'll have to live with it.

REPLACE INTO anyone? ;)

Chris




Re: Hard problem with concurrency

From
Greg Stark
Date:
Hm, odd, nobody mentioned this solution:

If you don't have a primary key already, create a unique index on the
combination you want to be unique. Then:

. Try to insert the record
. If you get a duplicate key error then do update instead

No possibilities of duplicate records due to race conditions. If two people
try to insert/update at the same time you'll only get one of the two results,
but that's the downside of the general approach you've taken. It's a tad
inefficient if the usual case is updates, but certainly not less efficient
than doing table locks.
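
In Postgres the duplicate-key error aborts the current transaction, so the
fallback UPDATE has to run in a fresh one.  A rough sketch against the
hypothetical hits table from the first message:

BEGIN;
INSERT INTO hits (member_id, hit_count, last_seen)
    VALUES (42, 1, now());
COMMIT;

-- if the INSERT failed with a duplicate-key error, the client must
-- ROLLBACK and do the update as a new transaction:
BEGIN;
UPDATE hits SET hit_count = hit_count + 1, last_seen = now()
    WHERE member_id = 42;
COMMIT;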

I'm not sure what you're implementing here. Depending on what it is you might
consider having a table of raw data that you _only_ insert into. Then you
process those results into a table with the consolidated data you're trying to
gather. I've usually found that's more flexible later because then you have
all the raw data in the database even if you only present a limited view.

-- 
greg



Re: Hard problem with concurrency

From
"Christopher Kings-Lynne"
Date:
> If you don't have a primary key already, create a unique index on the
> combination you want to be unique. Then:
>
> . Try to insert the record
> . If you get a duplicate key error
>   then do update instead
>
> No possibilities of duplicate records due to race conditions. If two people
> try to insert/update at the same time you'll only get one of the two results,
> but that's the downside of the general approach you've taken. It's a tad
> inefficient if the usual case is updates, but certainly not less efficient
> than doing table locks.

The idea was to stop our postgres logs being spammed up with unique
constraint violation warnings... in which case your solution above is
identical to our current one: update and if it fails, insert.  Except that
since the row is likely to already be there, our current way will be a bit
more efficient.

Chris



Re: Hard problem with concurrency

From
Bruno Wolff III
Date:
On Sun, Feb 16, 2003 at 23:51:49 -0500, Greg Stark <gsstark@mit.edu> wrote:
> 
> Hm, odd, nobody mentioned this solution:
> 
> If you don't have a primary key already, create a unique index on the
> combination you want to be unique. Then:
> 
> . Try to insert the record
> . If you get a duplicate key error
>   then do update instead

I think this solution depends on records not being deleted. Otherwise
a record could be deleted between the insert attempt and the update attempt.


Re: Hard problem with concurrency

From
Bruce Momjian
Date:
Christopher Kings-Lynne wrote:
> > If you don't have a primary key already, create a unique index on the
> > combination you want to be unique. Then:
> >
> > . Try to insert the record
> > . If you get a duplicate key error
> >   then do update instead
> >
> > No possibilities of duplicate records due to race conditions. If two people
> > try to insert/update at the same time you'll only get one of the two results,
> > but that's the downside of the general approach you've taken. It's a tad
> > inefficient if the usual case is updates, but certainly not less efficient
> > than doing table locks.
> 
> The idea was to stop our postgres logs being spammed up with unique
> constraint violation warnings... in which case your solution above is
> identical to our current one: update and if it fails, insert.  Except that
> since the row is likely to already be there, our current way will be a bit
> more efficient.

To control the spamming, use server_min_messages before the INSERT.
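
For example (a sketch; level names and behavior vary across versions):

SET server_min_messages = panic;   -- nothing below PANIC reaches the log
INSERT INTO hits (member_id, hit_count, last_seen) VALUES (42, 1, now());
SET server_min_messages = notice;  -- restore your normal level afterwards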

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073


Re: Hard problem with concurrency

From
Peter Eisentraut
Date:
Christopher Kings-Lynne writes:

> REPLACE INTO anyone? ;)

The upcoming SQL 200x standard includes a MERGE command that appears to
fulfill that purpose.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: Hard problem with concurrency

From
"Ron Mayer"
Date:
FWIW, that's the approach O*'s taking.  http://otn.oracle.com/products/oracle9i/daily/Aug24.html




Re: Hard problem with concurrency

From
Hannu Krosing
Date:
Peter Eisentraut wrote on Tue, 18.02.2003 at 21:02:
> Christopher Kings-Lynne writes:
> 
> > REPLACE INTO anyone? ;)
> 
> The upcoming SQL 200x standard includes a MERGE command that appears to
> fulfill that purpose.

Where is this upcoming standard available on the net?

----------------
Hannu



Re: Hard problem with concurrency

From
Peter Eisentraut
Date:
Hannu Krosing writes:

> Where is this upcoming standard available on net ?

Near ftp://sqlstandards.org/SC32/WG3/Progression_Documents/FCD

-- 
Peter Eisentraut   peter_e@gmx.net



Re: Hard problem with concurrency

From
"Christopher Kings-Lynne"
Date:
> > REPLACE INTO anyone? ;)
>
> The upcoming SQL 200x standard includes a MERGE command that appears to
> fulfill that purpose.

Is there somewhere that I can read that spec?

Or can you just post the MERGE syntax for us?

*sigh* It's just like a standard to come up with a totally new syntax for a
feature that no-one has except MySQL who use a different syntax :)

Thanks,

Chris



Re: Hard problem with concurrency

From
Bruce Momjian
Date:
URL added to developers FAQ.

---------------------------------------------------------------------------

Peter Eisentraut wrote:
> Hannu Krosing writes:
> 
> > Where is this upcoming standard available on net ?
> 
> Near ftp://sqlstandards.org/SC32/WG3/Progression_Documents/FCD
> 
> -- 
> Peter Eisentraut   peter_e@gmx.net
> 
> 

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073


Re: Hard problem with concurrency

From
"Ron Mayer"
Date:
Christopher Kings-Lynne wrote:
>
>*sigh* It's just like a standard to come up with a totally new syntax for a
>feature that no-one has except MySQL who use a different syntax :)

You sure? :)
  http://otn.oracle.com/products/oracle9i/daily/Aug24.html
MERGE INTO SALES_FACT D
   USING SALES_JUL01 S
   ON (D.TIME_ID = S.TIME_ID
       AND D.STORE_ID = S.STORE_ID
       AND D.REGION_ID = S.REGION_ID)
   WHEN MATCHED THEN UPDATE
     SET d_parts     = d_parts + s_parts,
         d_sales_amt = d_sales_amt + s_sales_amt,
         d_tax_amt   = d_tax_amt + s_tax_amt,
         d_discount  = d_discount + s_discount
   WHEN NOT MATCHED THEN
     INSERT (D.TIME_ID, D.STORE_ID, D.REGION_ID,
             D.PARTS, D.SALES_AMT, D.TAX_AMT, D.DISCOUNT)
     VALUES (S.TIME_ID, S.STORE_ID, S.REGION_ID,
             S.PARTS, S.SALES_AMT, S.TAX_AMT, S.DISCOUNT);

For those who last played with 8.x, Oracle has a couple of other
new features in 9i.  This is the best doc I saw describing them:
http://www.oracle-base.com/Articles/9i/SQLNewFeatures9i.asp




Re: Hard problem with concurrency

From
Peter Eisentraut
Date:
Christopher Kings-Lynne writes:

> *sigh* It's just like a standard to come up with a totally new syntax for a
> feature that no-one has except MySQL who use a different syntax :)

Actually that command was copied straight out of Oracle.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: Hard problem with concurrency

From
Vincent van Leeuwen
Date:
On 2003-02-18 20:02:29 +0100, Peter Eisentraut wrote:
> Christopher Kings-Lynne writes:
> 
> > REPLACE INTO anyone? ;)
> 
> The upcoming SQL 200x standard includes a MERGE command that appears to
> fulfill that purpose.
> 

MySQL features a poor man's approach to this problem, their REPLACE command:

http://www.mysql.com/doc/en/REPLACE.html
REPLACE works exactly like INSERT, except that if an old record in the table
has the same value as a new record on a UNIQUE index or PRIMARY KEY, the old
record is deleted before the new record is inserted.
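
With the hypothetical hits table from the start of the thread, that looks
something like:

REPLACE INTO hits (member_id, hit_count, last_seen)
    VALUES (42, 1, NOW());

Though note the delete-then-insert semantics: the old row's values are gone,
so a running counter like hit_count gets reset to 1 rather than incremented.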

I'd love to see this kind of functionality in PG.  I've got a database that
caches data and only gets conditional INSERT/UPDATEs, so that would save a
lot of wasted SQL commands.


Vincent van Leeuwen
Media Design



Re: Hard problem with concurrency

From
CoL
Date:
Hi,

Vincent van Leeuwen wrote, On 2/19/2003 10:08 PM:
> On 2003-02-18 20:02:29 +0100, Peter Eisentraut wrote:
>> Christopher Kings-Lynne writes:
>> 
>> > REPLACE INTO anyone? ;)
>> 
>> The upcoming SQL 200x standard includes a MERGE command that appears to
>> fulfill that purpose.
>> 
> 
> MySQL features a poor-mans aproach to this problem, their REPLACE command:
> 
> http://www.mysql.com/doc/en/REPLACE.html
> REPLACE works exactly like INSERT, except that if an old record in the table
> has the same value as a new record on a UNIQUE index or PRIMARY KEY, the old
> record is deleted before the new record is inserted.
> 
> I'd love to see this kind of functionality in PG, I've got a database that
> caches data which only gets conditional INSERT/UPDATEs, so that would save a
> lot of wasted SQL commands.

I think this replace function in mysql is a bad idea.  It deletes the
row, and what if that row is linked to from another table?  You lose your
connections, your relations.
However, you can easily write a procedure which does a real replace:
it checks whether the same data (by keys) is already in the table and
makes an update if so, or an insert if not.

You can do anything you like, unlike in mysql; just write it as you want.
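
A sketch of such a procedure, reusing the hypothetical hits table from the
start of the thread (the row-does-not-exist case still races, as discussed
upthread):

CREATE FUNCTION replace_hit(p_member integer) RETURNS void AS $$
BEGIN
    -- lock the row if it already exists, so the update path is safe
    PERFORM member_id FROM hits WHERE member_id = p_member FOR UPDATE;
    IF FOUND THEN
        UPDATE hits SET hit_count = hit_count + 1, last_seen = now()
            WHERE member_id = p_member;
    ELSE
        INSERT INTO hits (member_id, hit_count, last_seen)
            VALUES (p_member, 1, now());
    END IF;
END;
$$ LANGUAGE plpgsql;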

C.



Re: Hard problem with concurrency

From
Manfred Koizar
Date:
On Mon, 17 Feb 2003 09:51:54 +0800, "Christopher Kings-Lynne"
<chriskl@familyhealth.com.au> wrote:
>Strategy three:
>
>begin;
>lock table in exclusive mode;
>update row;
>if (no rows affected) insert row;
>commit;
>
>Problem - Works, but this table needs high concurrency.

Chris,

distributing the contention might improve that.  Instead of locking the
whole table, just lock one row in a dummy table:
CREATE TABLE dummylock (id INT PRIMARY KEY);
INSERT INTO dummylock VALUES (0);
INSERT INTO dummylock VALUES (1);
...
INSERT INTO dummylock VALUES (999);

Create an immutable function lockhash(<type of PK>) returning a value
between 0 and 999.
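
For an integer primary key, lockhash could be as simple as (the hash itself
is just one choice):

CREATE FUNCTION lockhash(integer) RETURNS integer AS
    'SELECT abs($1) % 1000;'
    LANGUAGE sql IMMUTABLE;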

BEGIN;
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
UPDATE t SET c = 'newval' WHERE pk = 'pk';
if (no rows affected) THEN
    SELECT * FROM dummylock WHERE id = lockhash('pk') FOR UPDATE;
    -- try again
    UPDATE t SET c = 'newval' WHERE pk = 'pk';
    if (no rows affected) THEN
        INSERT INTO t ...;
    END IF;
END IF;
COMMIT;

This is just an idea.  Completely untested ...

Servus
Manfred