Thread: serialization errors when inserting new records

serialization errors when inserting new records

From
Ralph van Etten
Date:
Hoi,

I searched the archives but couldn't find an answer to this:

I have a table (simplyfied)

CREATE TABLE test (
  id   INT PRIMARY KEY,
  name VARCHAR(250)
);

I insert records with

INSERT INTO test (id, name)
SELECT COALESCE(MAX(id)+1, 1), 'name' FROM test

Ofcourse this gives problems when two clients are inserting a record at
the same time. (duplicate primary keys) But, i can't use a sequence in my
application (the pk consists of more than just a sequence)

one solution would be to do a  'LOCK TABLE test IN SHARE MODE' before
inserting. This solves my problem but i'm not sure if its the
best way to deal with this kind of concurrency problems ? Is there a
better way ?



Thanks in advance.

Ralph.






Re: serialization errors when inserting new records

From
Tino Wildenhain
Date:
Am Samstag, den 22.01.2005, 11:14 +0100 schrieb Ralph van Etten:
> Hoi,
>
> I searched the archives but couldn't find an answer to this:
>
> I have a table (simplyfied)
>
> CREATE TABLE test (
>   id   INT PRIMARY KEY,
>   name VARCHAR(250)
> );
>
> I insert records with
>
> INSERT INTO test (id, name)
> SELECT COALESCE(MAX(id)+1, 1), 'name' FROM test
>
> Ofcourse this gives problems when two clients are inserting a record at
> the same time. (duplicate primary keys) But, i can't use a sequence in my
> application (the pk consists of more than just a sequence)
>
> one solution would be to do a  'LOCK TABLE test IN SHARE MODE' before
> inserting. This solves my problem but i'm not sure if its the
> best way to deal with this kind of concurrency problems ? Is there a
> better way ?

Of course. The solution to this problem is inherent to a good database
and has therefore been done long long ago ;)

See: http://borg.postgresql.org/docs/7.4/static/datatype.html#DATATYPE-SERIAL

Regards
Tino


Re: serialization errors when inserting new records

From
Gary Doades
Date:
Ralph van Etten wrote:
> Hoi,
>
> I searched the archives but couldn't find an answer to this:
>
> I have a table (simplyfied)
>
> CREATE TABLE test (
>   id   INT PRIMARY KEY,
>   name VARCHAR(250)
> );
>
> I insert records with
>
> INSERT INTO test (id, name)
> SELECT COALESCE(MAX(id)+1, 1), 'name' FROM test
>
> Ofcourse this gives problems when two clients are inserting a record at
> the same time. (duplicate primary keys) But, i can't use a sequence in my
> application (the pk consists of more than just a sequence)
>

It's not clear why you can't use a serial as the primary key or as part
of the primary key. From your example it looks like you are trying to do
exactly that.

What does your *real* primary key consist of?

Cheers,
Gary.

Re: serialization errors when inserting new records

From
Ralph van Etten
Date:
On Sat, 22 Jan 2005, Tino Wildenhain wrote:

> > INSERT INTO test (id, name)
> > SELECT COALESCE(MAX(id)+1, 1), 'name' FROM test
> >
> > Ofcourse this gives problems when two clients are inserting a record at
> > the same time. (duplicate primary keys) But, i can't use a sequence in my
> > application (the pk consists of more than just a sequence)
> >
> > one solution would be to do a  'LOCK TABLE test IN SHARE MODE' before
> > inserting. This solves my problem but i'm not sure if its the
> > best way to deal with this kind of concurrency problems ? Is there a
> > better way ?
>
> Of course. The solution to this problem is inherent to a good database
> and has therefore been done long long ago ;)

I agree that a serial would be better.

But I think there are situations where a serial isn't convenient
Like when you want an primary key which consists of the current
year and an sequence number. Like ('05', 1), ('05', 2), ('05', 3) etc.
With a sequence you must write extra code to reset the sequence every year
and you get into trouble if someone inserts data from the previous year...
A 'MAX(id)+1' is much simpler and cleaner then.

Ralph.





Re: serialization errors when inserting new records

From
Bruno Wolff III
Date:
On Sun, Jan 23, 2005 at 10:23:50 +0100,
  Ralph van Etten <ralph@et10.org> wrote:
> But I think there are situations where a serial isn't convenient
> Like when you want an primary key which consists of the current
> year and an sequence number. Like ('05', 1), ('05', 2), ('05', 3) etc.
> With a sequence you must write extra code to reset the sequence every year
> and you get into trouble if someone inserts data from the previous year...
> A 'MAX(id)+1' is much simpler and cleaner then.

You don't normally want to do that either. You should use one sequence for all
of the years.

Re: serialization errors when inserting new records

From
William Yu
Date:
Ralph van Etten wrote:
> I agree that a serial would be better.
>
> But I think there are situations where a serial isn't convenient
> Like when you want an primary key which consists of the current
> year and an sequence number. Like ('05', 1), ('05', 2), ('05', 3) etc.
> With a sequence you must write extra code to reset the sequence every year
> and you get into trouble if someone inserts data from the previous year...
> A 'MAX(id)+1' is much simpler and cleaner then.

My personal experience is trying to get primary keys to "mean" something
is a pain in the ass. In your example, I'd much rather stick with serial
as the primary key and store the year/sequence in another "display"
field. Think about this situation:

1) records 5/1 to 5/2000 are loaded into a table
2) oops, you made a mistake ... another 200 records should have been
shoehorned @ 5/20
3) ugh...you now have to add +200 to records 5/20 to 5/2000 -- and you
have to do it 1 record at a time in reverse order. (trying to do an
update via a single command will immediately produce a dup key violation
--> 5/20+200 = 5/220 which already exists)
4) double ugh...you've got FK already pointed to those records -- now
you have to drop the FKs first, update those tables, do step #3 again,
recreate the FKs.

On the otherhand, if you just used arbitary numbers, you could update
5/20 to 5/2000 with a single command, load the correct 5/20-5/220
records in and voila.

Re: serialization errors when inserting new records

From
Ralph van Etten
Date:
On Sun, 23 Jan 2005, William Yu wrote:

> Ralph van Etten wrote:
> > I agree that a serial would be better.
> >
> > But I think there are situations where a serial isn't convenient
> > Like when you want an primary key which consists of the current
> > year and an sequence number. Like ('05', 1), ('05', 2), ('05', 3) etc.
> > With a sequence you must write extra code to reset the sequence every year
> > and you get into trouble if someone inserts data from the previous year...
> > A 'MAX(id)+1' is much simpler and cleaner then.
>
> My personal experience is trying to get primary keys to "mean" something
> is a pain in the ass. In your example, I'd much rather stick with serial
> as the primary key and store the year/sequence in another "display"
> field. Think about this situation:

Ok, but then I have the same problem. The "display" field has to be
unique and it needs to use a SELECT MAX in an INSERT query and this gives
the same problem. Only not with the primary key but with possible
duplicate values in the "display" field.

CREATE TABLE test (
  pk    SERIAL,
  disp1 CHAR(2),
  disp2 INT,

  UNIQUE(disp1, disp2),
  PRIMARY KEY(pk)
)

and insert with something like:

INSERT INTO test (disp1, disp2)
SELECT 'XX', COALESCE(MAX(disp2)+1, 1)
  FROM test WHERE disp1 = 'XX';

This gives the same serialization errors.

Ralph.



Re: serialization errors when inserting new records

From
William Yu
Date:
Ralph van Etten wrote:
> Ok, but then I have the same problem. The "display" field has to be
> unique and it needs to use a SELECT MAX in an INSERT query and this gives
> the same problem. Only not with the primary key but with possible
> duplicate values in the "display" field.

If you absolutely need the display field to be unique and it has to be a
sequence starting from 1 -- it's simplicity itself. Use the
serialization errors to your advantage. If you try to insert a record
and you get a dupe key violation, just increment your seq counter by 1
and try again. Keep trying until you succeed. No need to lock the tables
at all.

If the display field is just an arbitrary reference number that doesn't
require 1..n, you've got even more ways of making it work. Make it a
compound field consisting of something that's impossible do duplicate --
the server's process IDs. Since I primarily use perl, I myself use the
perl process ID but I'm pretty sure you can get the connection's process
ID from querying the information schema. Ie, postmaster pid 5673  can
only run 1 statement at a time so if you insert 05-5673-15, you are
guaranteed success.

Re: serialization errors when inserting new records

From
Ralph van Etten
Date:
On Mon, 24 Jan 2005, William Yu wrote:

> If you absolutely need the display field to be unique and it has to be a
> sequence starting from 1 -- it's simplicity itself. Use the
> serialization errors to your advantage. If you try to insert a record
> and you get a dupe key violation, just increment your seq counter by 1
> and try again. Keep trying until you succeed. No need to lock the tables
> at all.

Just trial and error ?
I don't think its a very good solution. Then you have to figure out
exactly which error it is and which column caused the violation. (How ? By
parsing the error string ???)
And if the correct error was returned you retry the query... But how many
times ? Just once ? or a 1000 times ?
I'm sorry but I think this solution is just a Q&D hack.