Thread: transaction confusion
I have a function that deletes all the values in a table and then inserts the data again. If this function is run in 2 sessions at the same time then it populates it twice, giving me a unique value error, because one of the fields is supposed to be unique. I understand this because the second time it is run, the function does not see the delete as the first transaction is still running, so it deletes the same data the first one deleted and does not know that the new data exists when it starts to delete. I need a table level lock to prevent this from happening. I have a second function that takes 3 time as long to run and also includes a delete and insert statement as well as a lot of other stuff. The weird thing is if this function is run in two separate sessions at the same time the data is not duplicated. There is no lock and I don't have a unique index on this table and when the process is finished running there is the same number of records whether I run it once or twice. This is how I would like it to run, but I have to know why it works, otherwise I'll be worried that it will bite me in a couple months because something surfaces that I didn't look at before. I have tested it in PGAdmin. The first staements of the function are delete and insert. When I run just the delete and insert at the same time, then it puts in double the records. When I run the function at the same time then it only puts in the records once. Its as if the function is waiting for the first function to completely finish before starting. The function that duplicates (and causes the error) is just a straight delete and insert. The one that doesn't includes a for loop on a select with an update statement (on the same table it is selecting from) in the loop AFTER the delete and insert statements. Does this make any sense?
On Sun, Sep 17, 2006 at 10:32:12AM +0200, Sim Zacks wrote: > I have a function that deletes all the values in a table and then > inserts the data again. If this function is run in 2 sessions at the > same time then it populates it twice, giving me a unique value error, > because one of the fields is supposed to be unique. <snip> > Does this make any sense? Not overly much to me anyway. Could you provide some cut-and-paste output from a psql session so we can see the actual commands you're typing. It's possible the actual deletes and inserts are coded in a way to cause a problem... Hope this helps, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
This function, when run in 2 separate sessions at the same time, causes a duplicate key error because popartid is a primary key. This makes sense to me. When the function is run in the second session it doesn't know about the transaction currently running in the first session. Therefore, when it does its delete, it obviously doesn't delete that data. When the first one finishes it puts the data into the table. When the second one finishes, it attempts to put its data into the table but finds records already there, which violate the primary key and cause an error. CREATE OR REPLACE FUNCTION populaterescheduleparts() RETURNS void AS $BODY$ begin delete from reschedulepoparts; insert into reschedulepoparts(popartid,priority,rescqty) SELECT a.popartid, a.priority, sum(b.rescqty) FROM reschedulepoparts_2 a JOIN reschedulepoparts_1 b ON a.popartid = b.popartid AND a.priority = b.priority group by a.popartid, a.priority; return; end; $BODY$ LANGUAGE 'plpgsql' VOLATILE; The next one is a bit longer. I would expect that if it was run in 2 separate sessions that 1) it would generate the same error as the first one generated, because it also populates the same table and 2) that there would be twice the number of records in the stat_allocated_components table. However, no matter how many tests I run, this function always exits with the correct number of rows in the table and without error. CREATE OR REPLACE FUNCTION populate_allocated_components() RETURNS void AS $BODY$ declare statrec stat_allocated_components; crs refcursor; statpos stat_allocated_components; v_partid int; v_lfbused int8; v_lfused int8; v_lused int8; v_polf int8; v_pol int8; v_polfb int8; v_availableLF int8; v_availableL int8; v_availableLFB int8; v_balancel int8; v_balancelf int8; leftovers int8; futurel int8; futurelf int8; futurelfb int8; beforeupdate int8; v_firstneedl date; v_firstneedlf date; sumqty int8; rowresc record; leadfound bool; tqty int8; Begin update systemsettings set lastranallocated = now(); delete from stat_allocated_components; delete from reschedulepoparts; insert into stat_allocated_components(partid,quantity,assembliesbatchid,assemblyname, popartid,duedate,stock,leadfree,l,lf,lfb,lbp,leadstateid) SELECT a.partid, a.quantity, c.assembliesbatchid, d.assemblyname,0, CASE WHEN (c.entrydate + '49 days'::interval) < c.lastmodified THEN c.lastmodified ELSE c.entrydate + '49 days'::interval END::timestamp with time zone AS duedate, case when d.leadfree then coalesce(g.totallf,0) else coalesce(g.totallead,0) end AS stock, d.leadfree,coalesce(g.l,0),coalesce(g.lf,0),coalesce(g.lfb,0),coalesce(g.lbp,0), case when d.leadfree then 1 else 2 end FROM assemblies d JOIN assembliesbatch c ON d.assemblyid = c.assemblyid JOIN allocatedassemblies a ON a.assembliesbatchid = c.assembliesbatchid left Join stockperownerandleadstate g on g.partid=a.partid and g.ownerid=1 WHERE c.assembliesbatchstatusid in (1,2,4,7) and a.quantity<0; insert into stat_allocated_components(partid,quantity,assembliesbatchid,assemblyname,popartid,duedate,l,lf,lfb,lbp,leadstateid,postatusid) SELECT a.partid, a.quantity, a.popartid AS assembliesbatchid, a.poref AS assemblyname, a.popartid AS popartid, a.expecteddate AS duedate, coalesce(g.l,0),coalesce(g.lf,0),coalesce(g.lfb,0),coalesce(g.lbp,0),a.leadstateid,postatusid FROM expectedpodelivery a Left join stockperownerandleadstate g on g.partid=a.partid and g.ownerid=1 ; v_partid=-1; for statrec in select * from stat_allocated_components order by partid,duedate,assembliesbatchid LOOP if v_partid<>statrec.partid then v_partid=statrec.partid; v_lfbused:=0; v_lfused:=0; v_lused:=0; v_polf:=0; v_pol:=0; v_polfb:=0; v_balancel:=0; v_balancelf:=0; v_firstneedl:=null; v_firstneedlf:=null; tqty:=0; end if; if statrec.quantity<0 then v_balancel:=statrec.stock -(v_lused+v_lfbused)+v_pol + v_polfb+case when not statrec.leadfree then statrec.quantity else 0 end; v_balancelf:=statrec.stock -(v_lfused+v_lfbused)+v_polf + v_polfb+case when statrec.leadfree then statrec.quantity else 0 end; if v_balancel<0 and v_firstneedl is null then v_firstneedl:=statrec.duedate; end if; if v_balancelf<0 and v_firstneedlf is null then v_firstneedlf:=statrec.duedate; end if; update stat_allocated_components set previouscommitmentlf=v_lfused+v_lfbused,previouscommitmentl=v_lused+v_lfbused,previouspol=v_pol,previouspolf=v_polf,previouspolfb=v_polfb, balance=case when statrec.leadfree then v_balancelf else v_balancel end, instock=case when statrec.leadfree then v_balancelf else v_balancel end>0 where id=statrec.id; v_availableLF:=statrec.lf+v_polf-v_lfused; v_availableL:=statrec.l+statrec.lbp+v_pol-v_lused; v_availableLFB:=statrec.lfb+v_polfb-v_lfbused; v_lfbused:=v_lfbused+lesserof(abs(statrec.quantity) - lesserof(case when statrec.leadfree then greaterof(v_availablelf,0) else greaterof(v_availablel,0) end,abs(statrec.quantity)),greaterof(v_availablelfb,0)); if statrec.leadfree then v_lfused:=v_lfused+lesserof(abs(statrec.quantity) - lesserof(abs(statrec.quantity) - lesserof(greaterof(v_availablelf,0),abs(statrec.quantity)),greaterof(v_availablelfb,0)),greaterof(v_availablelf,0)); leftovers:= abs(statrec.quantity)- (lesserof(abs(statrec.quantity) - lesserof(abs(statrec.quantity) - lesserof(greaterof(v_availablelf,0),abs(statrec.quantity)), greaterof(v_availablelfb,0)), greaterof(v_availablelf,0)) + lesserof(abs(statrec.quantity) - lesserof(greaterof(v_availablelf,0),abs(statrec.quantity)), greaterof(v_availablelfb,0))); else v_lused:=v_lused+lesserof(abs(statrec.quantity) - lesserof(abs(statrec.quantity) - lesserof(greaterof(v_availablel,0),abs(statrec.quantity)),greaterof(v_availablelfb,0)),greaterof(v_availablel,0)); leftovers:= abs(statrec.quantity)- (lesserof(abs(statrec.quantity) - lesserof(abs(statrec.quantity) - lesserof(greaterof(v_availablel,0),abs(statrec.quantity)), greaterof(v_availablelfb,0)), greaterof(v_availablel,0)) + lesserof(abs(statrec.quantity) - lesserof(greaterof(v_availablel,0),abs(statrec.quantity)), greaterof(v_availablelfb,0))); end if; if leftovers>0 then futurelfb:=0; futurelf:=0; futurel:=0; if crs is not null then close crs; end if; open crs for select * from stat_allocated_components where quantity>0 and partid=statrec.partid and (duedate>statrec.duedate or (duedate=statrec.duedate and assembliesbatchid>statrec.assembliesbatchid)) order by duedate,assembliesbatchid; fetch crs into statpos; While found LOOP if statpos.leadstateid =1 and statrec.leadfree then futurelf:=futurelf+statpos.quantity; beforeupdate:=v_lfused; v_lfused:=v_lfused + lesserof(leftovers,futurelf+v_polf+statrec.lf-v_lfused); leftovers:=leftovers-lesserof(leftovers,futurelf+v_polf+statrec.lf-beforeupdate); elsif statpos.leadstateid = 2 and not statrec.leadfree then futurel:=futurel+statpos.quantity; beforeupdate:=v_lused; v_lused:=v_lused + lesserof(leftovers,futurel+v_pol+statrec.l-v_lused); leftovers:=leftovers-lesserof(leftovers,futurel+v_pol+statrec.l-beforeupdate); elsif statpos.leadstateid in (3,4) then futurelfb:=futurelfb+statpos.quantity; beforeupdate:=v_lfbused; v_lfbused:=v_lfbused + lesserof(leftovers,futurelfb+v_polfb+statrec.lfb-v_lfbused); leftovers:=leftovers-lesserof(leftovers,futurelfb+v_polfb+statrec.lfb-beforeupdate); end if; if leftovers=0 then exit; end if; fetch crs into statpos; end LOOP; if leftovers<>0 then if statrec.leadfree then v_lfused:=v_lfused+leftovers; else v_lused:=v_lused+leftovers; end if; end if; end if; else sumqty:=0; if statrec.leadstateid = 2 then v_pol:=v_pol+statrec.quantity; if v_balancel<0 then insert into reschedulepoparts(popartid,priority,rescqty) values(statrec.assembliesbatchid,v_firstneedl,lesserof(-v_balancel,statrec.quantity)); if lesserof(-v_balancel,statrec.quantity)=-v_balancel then v_firstneedl:=null; else for rowresc in select duedate,leadfree,balance from stat_allocated_components where not leadfree and duedate>=v_firstneedl and balance<0 and duedate<=statrec.duedate and partid=v_partid order by duedate,assembliesbatchid Loop sumqty:=-rowresc.balance; if sumqty>statrec.quantity then v_firstneedl:=rowresc.duedate; exit; end if; End Loop; end if; v_balancel:=v_balancel+statrec.quantity; end if; elsif statrec.leadstateid=1 then v_polf:=v_polf+statrec.quantity; if v_balancelf<0 then insert into reschedulepoparts(popartid,priority,rescqty) values(statrec.assembliesbatchid,v_firstneedlf,lesserof(-v_balancelf,statrec.quantity)); if lesserof(-v_balancelf,statrec.quantity)=-v_balancelf then v_firstneedlf:=null; else for rowresc in select duedate,leadfree,balance from stat_allocated_components where leadfree and duedate>=v_firstneedlf and balance<0 and duedate<=statrec.duedate and partid=v_partid order by duedate,assembliesbatchid Loop sumqty:=-rowresc.balance; if sumqty>statrec.quantity then v_firstneedlf:=rowresc.duedate; exit; end if; End Loop; end if; v_balancelf:=v_balancelf+statrec.quantity; end if; elsif statrec.leadstateid in (3,4) then v_polfb:=v_polfb+statrec.quantity; if v_balancelf<0 or v_balancel<0 then insert into reschedulepoparts(popartid,priority,rescqty) values(statrec.assembliesbatchid,lesserof(v_firstneedlf,v_firstneedl),lesserof(-(lesserof(v_balancelf,v_balancel)),statrec.quantity)); if lesserof(-(lesserof(v_balancelf,v_balancel)),statrec.quantity)=-(lesserof(v_balancelf,v_balancel)) then v_firstneedlf:=null; v_firstneedl:=null; else leadfound:=null; for rowresc in select duedate,leadfree,balance from stat_allocated_components where duedate>=v_firstneedlf and balance<0 and duedate<=statrec.duedate and partid=v_partid order by duedate,assembliesbatchid Loop sumqty:=-rowresc.balance; if leadfound is null then if sumqty>statrec.quantity then if rowresc.leadfree then v_firstneedlf:=rowresc.duedate; v_firstneedl:=null; leadfound:=false; else v_firstneedl:=rowresc.duedate; v_firstneedlf:=null; leadfound:=true; end if; end if; else if leadfound and rowresc.leadfree then v_firstneedlf:=rowresc.duedate; exit; elsif not leadfound and not rowresc.leadfree then v_firstneedl:=rowresc.duedate; exit; end if; end if; End Loop; end if; v_balancel:=v_balancel+statrec.quantity; v_balancelf:=v_balancelf+statrec.quantity; end if; end if; end if; End Loop; return; end; $BODY$ LANGUAGE 'plpgsql' VOLATILE; Martijn van Oosterhout wrote: > On Sun, Sep 17, 2006 at 10:32:12AM +0200, Sim Zacks wrote: >> I have a function that deletes all the values in a table and then >> inserts the data again. If this function is run in 2 sessions at the >> same time then it populates it twice, giving me a unique value error, >> because one of the fields is supposed to be unique. > > <snip> > >> Does this make any sense? > > Not overly much to me anyway. Could you provide some cut-and-paste > output from a psql session so we can see the actual commands you're > typing. It's possible the actual deletes and inserts are coded in a way > to cause a problem... > > Hope this helps,
Sim Zacks <sim@compulab.co.il> writes: > This function, when run in 2 separate sessions at the same time, > causes a duplicate key error because popartid is a primary key. > ... > The next one is a bit longer. I would expect that if it was run in 2 > separate sessions that 1) it would generate the same error as the > first one generated, I would expect that too, assuming that it actually inserts the same set of rows into the table as the first one does (how sure are you of that?). I wonder whether you are testing them both under identical conditions. In particular, is the initial DELETE really doing anything or is the table usually empty to start with anyway? If there is something to delete then that ought to serialize the two sessions, leading to no error (because the second guy in will wait to see if the first guy commits his deletion). [ eyeballs second function some more... ] Actually, the second function is guaranteed to be serialized by that initial "update systemsettings" --- I assume that's a one-row table? The second guy in will be unable to get past that until the first guy commits, and then he'll see the first guy's updates and there will be no error. I think the reason you are seeing failures in the first function is that the initial DELETE is a no-op so it doesn't serialize anything, and then there is conflict when the two INSERTs proceed in parallel. regards, tom lane
> I think the reason you are seeing failures in the first function is > that the initial DELETE is a no-op so it doesn't serialize anything, > and then there is conflict when the two INSERTs proceed in parallel. There is always data in the table that it deletes. So, according to that, with the first function the second process should be running serially and for some reason it is running in parallel. I just ran the function and there are now 61 rows in the table. I ran it again and there are still 61 rows in the table. I ran them both at the same time and: ERROR: duplicate key violates unique constraint "reschedulepoparts1_pkey" SQL state: 23505 Context: SQL statement "insert into reschedulepoparts(popartid,priority,rescqty) SELECT a.popartid, a.priority, sum(b.rescqty) FROM reschedulepoparts_2 a JOIN reschedulepoparts_1 b ON a.popartid = b.popartid AND a.priority = b.priority group by a.popartid, a.priority" PL/pgSQL function "populaterescheduleparts" line 3 at SQL statement My database is postgresql 8.0.4 running on gentoo. Is there anything else I can do to give you debug information? > [ eyeballs second function some more... ] Actually, the second > function is guaranteed to be serialized by that initial > "update systemsettings" --- I assume that's a one-row table? The second > guy in will be unable to get past that until the first guy commits, and > then he'll see the first guy's updates and there will be no error. Now I understand why the second one does not cause an error, because of the update statement. (Yes, it is a one record table). The question is why the first function is not running serially. Tom Lane wrote: > Sim Zacks <sim@compulab.co.il> writes: >> This function, when run in 2 separate sessions at the same time, >> causes a duplicate key error because popartid is a primary key. >> ... >> The next one is a bit longer. I would expect that if it was run in 2 >> separate sessions that 1) it would generate the same error as the >> first one generated, > > I would expect that too, assuming that it actually inserts the same set > of rows into the table as the first one does (how sure are you of that?). > I wonder whether you are testing them both under identical conditions. > In particular, is the initial DELETE really doing anything or is the > table usually empty to start with anyway? If there is something to > delete then that ought to serialize the two sessions, leading to no > error (because the second guy in will wait to see if the first guy > commits his deletion). > > [ eyeballs second function some more... ] Actually, the second > function is guaranteed to be serialized by that initial > "update systemsettings" --- I assume that's a one-row table? The second > guy in will be unable to get past that until the first guy commits, and > then he'll see the first guy's updates and there will be no error. > I think the reason you are seeing failures in the first function is > that the initial DELETE is a no-op so it doesn't serialize anything, > and then there is conflict when the two INSERTs proceed in parallel. > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match >
> I think the reason you are seeing failures in the first function is > that the initial DELETE is a no-op so it doesn't serialize anything, > and then there is conflict when the two INSERTs proceed in parallel. Here is a simple, reproducible example that delete doesn't cause it to use serial: create table testserial(id serial primary key, val int); create or replace function inserttest()returns int as $$ begin delete from testserial; for i in 1..100000 loop insert into testserial(val) values(i); end Loop; return 1; end; $$ language 'plpgsql'; select inserttest(); Now there are 100,000 records in the table. Run the function from 2 different sessions at the same time and you will see that there are 200,000 records in the table and not 100,000 records. I also tested with an update statement: create or replace function inserttest()returns int as $$ begin update testserial set val=5 where val=1; delete from testserial; for i in 1..100000 loop insert into testserial(val) values(i); end Loop; return 1; end; $$ language 'plpgsql'; When this function is run twice at the same time, it actually does run in serial and there is only 100,000 records in the table. Tom Lane wrote:
On Mon, Sep 18, 2006 at 10:31:26AM +0200, Sim Zacks wrote: > > I think the reason you are seeing failures in the first function is > > that the initial DELETE is a no-op so it doesn't serialize anything, > > and then there is conflict when the two INSERTs proceed in parallel. > > Here is a simple, reproducible example that delete doesn't cause it to use > serial: I wonder if you set the transaction mode to "serializable" whether that would make a difference. In standard read-committed the way it works below seems to be what's expected (each transaction sees what was committed at the time is ran). Have a ncie day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
> I wonder if you set the transaction mode to "serializable" whether that > would make a difference. In standard read-committed the way it works > below seems to be what's expected (each transaction sees what was > committed at the time is ran). > I tried running this in both windows at the same time (this is the inserttest() without the update statment at the beginning): begin; set transaction ISOLATION LEVEL SERIALIZABLE; select inserttest(); commit; and I got this error: ERROR: could not serialize access due to concurrent update SQL state: 40001 Context: SQL statement "delete from testserial" PL/pgSQL function "inserttest" line 2 at SQL statement
Sim Zacks <sim@compulab.co.il> writes: > Here is a simple, reproducible example that delete doesn't cause it to use serial: If you add raise notice 'past the delete'; between the DELETE and the insert loop, you will discover that the second guy in fact doesn't manage to complete the DELETE until the first one commits. He's finding the first deletable row, noticing that the first guy has a delete in progress on it, and waiting till the first guy commits. He then finishes scanning the table, but finds nothing he can delete --- every tuple visible to his snapshot is already committed deleted, and the rows added by the first guy aren't visible. Then when he gets to the next command, he takes a new snapshot, and suddenly the first guy's insertions *are* visible. Basically the reason that your complicated function works is that the update on the single-table row creates a serialization point before the snapshot for the DELETE is taken, while the simpler function serializes inside the DELETE --- after that snap is taken. regards, tom lane