Thread: serialization errors when inserting new records
Hoi, I searched the archives but couldn't find an answer to this: I have a table (simplyfied) CREATE TABLE test ( id INT PRIMARY KEY, name VARCHAR(250) ); I insert records with INSERT INTO test (id, name) SELECT COALESCE(MAX(id)+1, 1), 'name' FROM test Ofcourse this gives problems when two clients are inserting a record at the same time. (duplicate primary keys) But, i can't use a sequence in my application (the pk consists of more than just a sequence) one solution would be to do a 'LOCK TABLE test IN SHARE MODE' before inserting. This solves my problem but i'm not sure if its the best way to deal with this kind of concurrency problems ? Is there a better way ? Thanks in advance. Ralph.
Am Samstag, den 22.01.2005, 11:14 +0100 schrieb Ralph van Etten: > Hoi, > > I searched the archives but couldn't find an answer to this: > > I have a table (simplyfied) > > CREATE TABLE test ( > id INT PRIMARY KEY, > name VARCHAR(250) > ); > > I insert records with > > INSERT INTO test (id, name) > SELECT COALESCE(MAX(id)+1, 1), 'name' FROM test > > Ofcourse this gives problems when two clients are inserting a record at > the same time. (duplicate primary keys) But, i can't use a sequence in my > application (the pk consists of more than just a sequence) > > one solution would be to do a 'LOCK TABLE test IN SHARE MODE' before > inserting. This solves my problem but i'm not sure if its the > best way to deal with this kind of concurrency problems ? Is there a > better way ? Of course. The solution to this problem is inherent to a good database and has therefore been done long long ago ;) See: http://borg.postgresql.org/docs/7.4/static/datatype.html#DATATYPE-SERIAL Regards Tino
Ralph van Etten wrote: > Hoi, > > I searched the archives but couldn't find an answer to this: > > I have a table (simplyfied) > > CREATE TABLE test ( > id INT PRIMARY KEY, > name VARCHAR(250) > ); > > I insert records with > > INSERT INTO test (id, name) > SELECT COALESCE(MAX(id)+1, 1), 'name' FROM test > > Ofcourse this gives problems when two clients are inserting a record at > the same time. (duplicate primary keys) But, i can't use a sequence in my > application (the pk consists of more than just a sequence) > It's not clear why you can't use a serial as the primary key or as part of the primary key. From your example it looks like you are trying to do exactly that. What does your *real* primary key consist of? Cheers, Gary.
On Sat, 22 Jan 2005, Tino Wildenhain wrote: > > INSERT INTO test (id, name) > > SELECT COALESCE(MAX(id)+1, 1), 'name' FROM test > > > > Ofcourse this gives problems when two clients are inserting a record at > > the same time. (duplicate primary keys) But, i can't use a sequence in my > > application (the pk consists of more than just a sequence) > > > > one solution would be to do a 'LOCK TABLE test IN SHARE MODE' before > > inserting. This solves my problem but i'm not sure if its the > > best way to deal with this kind of concurrency problems ? Is there a > > better way ? > > Of course. The solution to this problem is inherent to a good database > and has therefore been done long long ago ;) I agree that a serial would be better. But I think there are situations where a serial isn't convenient Like when you want an primary key which consists of the current year and an sequence number. Like ('05', 1), ('05', 2), ('05', 3) etc. With a sequence you must write extra code to reset the sequence every year and you get into trouble if someone inserts data from the previous year... A 'MAX(id)+1' is much simpler and cleaner then. Ralph.
On Sun, Jan 23, 2005 at 10:23:50 +0100, Ralph van Etten <ralph@et10.org> wrote: > But I think there are situations where a serial isn't convenient > Like when you want an primary key which consists of the current > year and an sequence number. Like ('05', 1), ('05', 2), ('05', 3) etc. > With a sequence you must write extra code to reset the sequence every year > and you get into trouble if someone inserts data from the previous year... > A 'MAX(id)+1' is much simpler and cleaner then. You don't normally want to do that either. You should use one sequence for all of the years.
Ralph van Etten wrote: > I agree that a serial would be better. > > But I think there are situations where a serial isn't convenient > Like when you want an primary key which consists of the current > year and an sequence number. Like ('05', 1), ('05', 2), ('05', 3) etc. > With a sequence you must write extra code to reset the sequence every year > and you get into trouble if someone inserts data from the previous year... > A 'MAX(id)+1' is much simpler and cleaner then. My personal experience is trying to get primary keys to "mean" something is a pain in the ass. In your example, I'd much rather stick with serial as the primary key and store the year/sequence in another "display" field. Think about this situation: 1) records 5/1 to 5/2000 are loaded into a table 2) oops, you made a mistake ... another 200 records should have been shoehorned @ 5/20 3) ugh...you now have to add +200 to records 5/20 to 5/2000 -- and you have to do it 1 record at a time in reverse order. (trying to do an update via a single command will immediately produce a dup key violation --> 5/20+200 = 5/220 which already exists) 4) double ugh...you've got FK already pointed to those records -- now you have to drop the FKs first, update those tables, do step #3 again, recreate the FKs. On the otherhand, if you just used arbitary numbers, you could update 5/20 to 5/2000 with a single command, load the correct 5/20-5/220 records in and voila.
On Sun, 23 Jan 2005, William Yu wrote: > Ralph van Etten wrote: > > I agree that a serial would be better. > > > > But I think there are situations where a serial isn't convenient > > Like when you want an primary key which consists of the current > > year and an sequence number. Like ('05', 1), ('05', 2), ('05', 3) etc. > > With a sequence you must write extra code to reset the sequence every year > > and you get into trouble if someone inserts data from the previous year... > > A 'MAX(id)+1' is much simpler and cleaner then. > > My personal experience is trying to get primary keys to "mean" something > is a pain in the ass. In your example, I'd much rather stick with serial > as the primary key and store the year/sequence in another "display" > field. Think about this situation: Ok, but then I have the same problem. The "display" field has to be unique and it needs to use a SELECT MAX in an INSERT query and this gives the same problem. Only not with the primary key but with possible duplicate values in the "display" field. CREATE TABLE test ( pk SERIAL, disp1 CHAR(2), disp2 INT, UNIQUE(disp1, disp2), PRIMARY KEY(pk) ) and insert with something like: INSERT INTO test (disp1, disp2) SELECT 'XX', COALESCE(MAX(disp2)+1, 1) FROM test WHERE disp1 = 'XX'; This gives the same serialization errors. Ralph.
Ralph van Etten wrote: > Ok, but then I have the same problem. The "display" field has to be > unique and it needs to use a SELECT MAX in an INSERT query and this gives > the same problem. Only not with the primary key but with possible > duplicate values in the "display" field. If you absolutely need the display field to be unique and it has to be a sequence starting from 1 -- it's simplicity itself. Use the serialization errors to your advantage. If you try to insert a record and you get a dupe key violation, just increment your seq counter by 1 and try again. Keep trying until you succeed. No need to lock the tables at all. If the display field is just an arbitrary reference number that doesn't require 1..n, you've got even more ways of making it work. Make it a compound field consisting of something that's impossible do duplicate -- the server's process IDs. Since I primarily use perl, I myself use the perl process ID but I'm pretty sure you can get the connection's process ID from querying the information schema. Ie, postmaster pid 5673 can only run 1 statement at a time so if you insert 05-5673-15, you are guaranteed success.
On Mon, 24 Jan 2005, William Yu wrote: > If you absolutely need the display field to be unique and it has to be a > sequence starting from 1 -- it's simplicity itself. Use the > serialization errors to your advantage. If you try to insert a record > and you get a dupe key violation, just increment your seq counter by 1 > and try again. Keep trying until you succeed. No need to lock the tables > at all. Just trial and error ? I don't think its a very good solution. Then you have to figure out exactly which error it is and which column caused the violation. (How ? By parsing the error string ???) And if the correct error was returned you retry the query... But how many times ? Just once ? or a 1000 times ? I'm sorry but I think this solution is just a Q&D hack.