Re: Revisited: Transactions, insert unique. - Mailing list pgsql-general

From Lincoln Yeoh
Subject Re: Revisited: Transactions, insert unique.
Date
Msg-id 3.0.5.32.20000426102701.009037a0@pop.mecomb.po.my
Whole thread Raw
In response to Re: Revisited: Transactions, insert unique.  (Ed Loehr <eloehr@austin.rr.com>)
Responses Re: Revisited: Transactions, insert unique.
List pgsql-general
At 10:58 AM 24-04-2000 -0500, Ed Loehr wrote:
>Good point.  And you can combine the check David suggests with the insert
>statement, e.g.,
>
>    INSERT INTO mytable (id, ...)
>        SELECT 7, ...
>        FROM mytable
>        WHERE NOT EXISTS (SELECT * FROM mytable WHERE id = 7)
>
>And then check the return result for number of rows inserted.  '0' means an
>update is needed.  I don't remember if there is cleaner more efficient
>manner for doing that, but probably so...

At first it looked like it might work, but it doesn't ensure uniqueness
when I checked. And after that I realised the situation was even more
difficult.

e.g.
create table test (a int, b text);
Transaction A
begin;
insert into test (a,b) select 4,'four' from test
where not exists (select * from test where a=4);

Transaction B
begin;
insert into test (a,b) select 4,'four' from test
where not exists (select * from test where a=4);

Then you do a commit on both, and you end up with two rows.

The reason this happens is because inserted stuff in one transaction is not
visible to other transactions until a commit. So even David's suggestion
won't work, nor will a "select for update", since with the current
implementation there is nothing to lock on till the commit.

This is correct behaviour, but that means we have to try something else to
ensure unique rows. If we resort to using UNIQUE then we'll have to redo
whole transactions till the error point. If we use the proposed arbitrary
lock, it's a lot more work for the developers plus it shifts so much more
responsibility to the application compared to stuff like select for update.
I kind of like the arbitrary lock thingy, but I'm not in a "large"
environment.

For large environments maybe there should be a select for insert? Or the
behaviour for select for update should change?

That is to mean that the statement blocks if there's any other prior select
for insert with a potentially interfering WHERE clause (ouch!), even if
there are NO relevant ROWS to return (ouch!), . It does not even begin to
fetch rows till after the other transactions are completed.

This doesn't look easy to do. But it's one of the ways to maintain the
Postgresql "no error transaction" behaviour.

To maintain this behaviour there need to be a few more ways to check for
stuff (and lock the state) before attempting to change or create stuff.
Stuff like create table and so on.

Aside: would an arbitrary lock mechanism be better for performance than
using a UNIQUE on a table? In most cases you're inserting to an indexed
table, so having a UNIQUE isn't going to cost much more right?

Cheerio,
Link.


pgsql-general by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: unique row identifier data type exhausted . . .
Next
From: Mike Mascari
Date:
Subject: Re: