Thread: Solution to UPDATE...INSERT problem

Solution to UPDATE...INSERT problem

From

"Christopher Kings-Lynne"

Date:

26 March 2003, 20:55:27

Hi Guys,

I just thought I'd share with you guys a very clever solution to the old
'update row.  if no rows affected, then insert the row' race condition
problem.  A guy at my work came up with it.

We were discussing this earlier on -hackers, but no-one could find a
solution that didn't involve locking the entire table around the
update...insert commands.

The problem is that sometimes the row will be inserted by another process
between your update and insert, causing your insert to fail with a unique
constraint violation.

So, say this is the insert:

INSERT INTO table VALUES (1, 'foo');  // 1 is in the primary key column

Rewrite it like this:

INSERT INTO table SELECT 1, 'foo' EXCEPT SELECT 1, 'foo' FROM table WHERE
pkcol=1;

See? So now that INSERT statement will insert the row if it doesn't exist,
or insert zero rows if it does.  You are then guaranteed that your
transaction will not fail and rollback, so you can repeat your update, or do
the insert first and then the update, etc.

Hope that's handy for people,

Chris

Re: Solution to UPDATE...INSERT problem

From

Tom Lane

Date:

27 March 2003, 01:41:41

"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
> INSERT INTO table SELECT 1, 'foo' EXCEPT SELECT 1, 'foo' FROM table WHERE
> pkcol=1;

> See? So now that INSERT statement will insert the row if it doesn't exist,
> or insert zero rows if it does.  You are then guaranteed that your
> transaction will not fail and rollback, so you can repeat your update, or do
> the insert first and then the update, etc.

Uh, why exactly do you think this is race-free?

It looks fancy, but AFAICS the SELECT will return info that is correct
as of its starting timestamp; which is not enough to guarantee that the
INSERT won't conflict with another transaction doing the same thing
concurrently.

            regards, tom lane

Re: Solution to UPDATE...INSERT problem

From

"Christopher Kings-Lynne"

Date:

27 March 2003, 01:55:46

> Uh, why exactly do you think this is race-free?
>
> It looks fancy, but AFAICS the SELECT will return info that is correct
> as of its starting timestamp; which is not enough to guarantee that the
> INSERT won't conflict with another transaction doing the same thing
> concurrently.

How about:

INSERT INTO table SELECT 1, 'foo' WHERE NOT EXISTS (SELECT TRUE FROM table
WHERE pkcol=1 FOR UPDATE);

It's a lot more straightforward and has a FOR UPDATE.  Can this still cause
unique constraint failures?

Chris

Re: Solution to UPDATE...INSERT problem

From

Tom Lane

Date:

27 March 2003, 02:13:38

"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
>> Uh, why exactly do you think this is race-free?

> How about:

> INSERT INTO table SELECT 1, 'foo' WHERE NOT EXISTS (SELECT TRUE FROM table
> WHERE pkcol=1 FOR UPDATE);

> It's a lot more straightforward and has a FOR UPDATE.  Can this still cause
> unique constraint failures?

Certainly.  FOR UPDATE locks an existing row; it cannot lock the
condition of non-existence of a row.
        regards, tom lane

Re: Solution to UPDATE...INSERT problem

From

Haroldo Stenger

Date:

27 March 2003, 16:42:45

El jue, 27-03-2003 a las 03:41, Tom Lane escribió:
> "Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes:
> > INSERT INTO table SELECT 1, 'foo' EXCEPT SELECT 1, 'foo' FROM table WHERE
> > pkcol=1;
>
> > See? So now that INSERT statement will insert the row if it doesn't exist,
> > or insert zero rows if it does.  You are then guaranteed that your
> > transaction will not fail and rollback, so you can repeat your update, or do
> > the insert first and then the update, etc.
>
> Uh, why exactly do you think this is race-free?
>
> It looks fancy, but AFAICS the SELECT will return info that is correct
> as of its starting timestamp; which is not enough to guarantee that the
> INSERT won't conflict with another transaction doing the same thing
> concurrently.

This approach certainly reduces significantly the time span within which
a race could occur, compared to, say, using two separate statements, or
worse, two statements in two consecutive transactions. But race
conditions either exist or they don't, so you're right.

Now, up to my knowledge this problem was only a an intractable one in
PostgreSQL because of transactions going into abort state once a unique
restriction violation happened. If savepoints/nested transactions were
there, one would simply put the insert within a protected area, and
retry as much as needed.

That's my bet why other databases don't seem to have a problem with this
one. Am I right? Or do they have some magic solution other than locking
the whole table?

Regards
Haroldo

Re: [GENERAL] Solution to UPDATE...INSERT problem

From

Lincoln Yeoh

Date:

03 April 2003, 12:12:05

AFAIK the "except" select won't see other inserts in uncommitted
transactions. If those transactions are committed you will end up with the
same problem. You can try it yourself, by manually doing two separate
transactions in psql.

You either have to lock the whole table, or lock at the application layer.
Some time back I suggested a "lock on arbitrary string" feature for
postgresql for this and various other purposes, but that feature probably
wouldn't scale in terms of management (it requires 100% cooperation amongst
all apps/clients involved).

There's no "select * from table where pkey=x for insert;" which would block
on uncommitted inserts/updates of pkey=x and other selects for insert/update.

In contrast "select ... for update" blocks on committed stuff.

Regards,
Link.

At 09:55 AM 3/27/03 +0800, Christopher Kings-Lynne wrote:

>Hi Guys,
>
>I just thought I'd share with you guys a very clever solution to the old
>'update row.  if no rows affected, then insert the row' race condition
>problem.  A guy at my work came up with it.
>
>We were discussing this earlier on -hackers, but no-one could find a
>solution that didn't involve locking the entire table around the
>update...insert commands.
>
>The problem is that sometimes the row will be inserted by another process
>between your update and insert, causing your insert to fail with a unique
>constraint violation.
>
>So, say this is the insert:
>
>INSERT INTO table VALUES (1, 'foo');  // 1 is in the primary key column
>
>Rewrite it like this:
>
>INSERT INTO table SELECT 1, 'foo' EXCEPT SELECT 1, 'foo' FROM table WHERE
>pkcol=1;
>
>See? So now that INSERT statement will insert the row if it doesn't exist,
>or insert zero rows if it does.  You are then guaranteed that your
>transaction will not fail and rollback, so you can repeat your update, or do
>the insert first and then the update, etc.
>
>Hope that's handy for people,
>
>Chris
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org