Thread: Do I need serializable for this query?
I have a table that keeps running totals. It is possible that this would get called twice simultaneously for the same UserID. Do I need to put this in a serializable transaction? Since this gets called often, could it be a performance problem if I make it serializable? CREATE FUNCTION UpdateTotals(IN UserID int, IN AddToCount int, IN AddToSize bigint) RETURNS void AS $$ BEGIN UPDATE Totals SET TotalSize = TotalSize + $2, TotalCount = TotalCount + $3 WHERE UserID = $1; END IF; END $$ LANGUAGE 'plpgsql' VOLATILE;
William Garrison wrote: > I have a table that keeps running totals. It is possible that this > would get called twice simultaneously for the same UserID. Do I need to > put this in a serializable transaction? Since this gets called often, > could it be a performance problem if I make it serializable? > > CREATE FUNCTION UpdateTotals(IN UserID int, > IN AddToCount int, > IN AddToSize bigint) > RETURNS void AS $$ > BEGIN > UPDATE > Totals > SET > TotalSize = TotalSize + $2, > TotalCount = TotalCount + $3 > WHERE > UserID = $1; > END IF; > END > $$ LANGUAGE 'plpgsql' VOLATILE; Where is that stray "END IF;" comming from? Anyway, this should be safe, and work even more reliably in read-committed mode than in serializable mode. In serializeable mode, if the Total of the same user is updated by two transactions simultanously, you'll get a SerializationError. Read-Committed mode OTOH will make sure that it uses the latest version of the tuple for calculating the new values. Usually, you need serializable mode if you do updates based on the results of previous selects. greetings, Florian Pflug
On Tue, 2007-04-10 at 14:45 -0400, William Garrison wrote: > I have a table that keeps running totals. It is possible that this > would get called twice simultaneously for the same UserID. Do I need to > put this in a serializable transaction? Since this gets called often, > could it be a performance problem if I make it serializable? > > CREATE FUNCTION UpdateTotals(IN UserID int, > IN AddToCount int, > IN AddToSize bigint) > RETURNS void AS $$ > BEGIN > UPDATE > Totals > SET > TotalSize = TotalSize + $2, > TotalCount = TotalCount + $3 > WHERE > UserID = $1; > END IF; > END > $$ LANGUAGE 'plpgsql' VOLATILE; > You only have one command to execute; you don't need to do anything special at all. If two UPDATEs happen to the same tuple at the same time, one will block waiting for the other to commit. Regards, Jeff Davis
William Garrison wrote: > I have a table that keeps running totals. It is possible that this > would get called twice simultaneously for the same UserID. Do I need > to put this in a serializable transaction? Transaction isolation is only a question of interest if you have more than one statement in a transaction. -- Peter Eisentraut http://developer.postgresql.org/~petere/
On Tue, Apr 10, 2007 at 10:52:11PM +0200, Peter Eisentraut wrote: > William Garrison wrote: > > I have a table that keeps running totals. It is possible that this > > would get called twice simultaneously for the same UserID. Do I need > > to put this in a serializable transaction? > > Transaction isolation is only a question of interest if you have more > than one statement in a transaction. Not true; the isolation level is also relevant if you're doing concurrent updates or deletes (although, as the documentation mentions, PostgreSQL's serializable isolation level doesn't guarantee true serializability). create table test (id integer primary key, t text); insert into test (id, t) values (1, 'a'); T1: begin isolation level serializable; T2: begin isolation level serializable; T1: update test set t = 'b' where id = 1; T2: update test set t = 'c' where id = 1; -- blocks T1: commit; T2: ERROR: could not serialize access due to concurrent update -- Michael Fuhr
This is part 2 of my "serializable" question from earlier. Thanks to everyone who answered the first part. (Hopefully there will be no part 3) When a user adds a record, I update a table with totals. But when a user deletes data, I subtract the totals. That presents a more interesting situation since it is theoretically possible to add a record and delete one at the same time: For simplicity sake, the two tables are something like: item(itemid int, data...) history(itemid int, versionid int, data...) -- This deletes the item and all the history records for the item CREATE FUNCTION DeleteItem(IN _UserID int, IN _ItemID int) RETURNS void AS $$ DECLARE mycount int; total bigint; BEGIN -- Cotal and total of data to be deleted SELECT COUNT(*), COALESCE(SUM(clientsize),0) INTO mycount,total FROM history WHERE userid= $1 AND itemid=$2; -- Potential problem point #1 -- Call the stored proc to update the count and total -- This is the one from my last email PERFORM sbp_UpdateTotals($1,-mycount,-total); -- Potential problem point #2 -- Delete the records DELETE FROM history WHERE userid = $1 AND itemid=$2; DELETE FROM items WHERE userid = $1 AND itemid=$2; END $$ LANGUAGE 'plpgsql' VOLATILE; Suppose the user adds another record to the history table with the same itemid as the one being deleted. If they do this at one of the two "problem points" then that history record will be deleted, but it will not be subtracted from the totals. Am I understanding this properly? In my system, this is highly unlikely, and probably not even something we care about very much. But I still want to know what to do. It seems to me I could a few things to fix this: 1) Make this serializable - I'm not even sure this will help, since adding a new history record won't change any records that this touches. Would PostgreSQL even realize that adding a history record would have changed the results of the select? 2) Table lock 3) Make the first select statement store the PK(itemid,versionid) of the history records and then only delete those records at the end. 4) Somehow calculate the count and total during the delete statement 5) Compare the # of records deleted with the # of records selected and throw/rollback if they mismatch. I'm curious which one I should do, and if my solutions are all valid. Also, I'm under the impression that I don't need to do any sort of begin/end transaction stuff in here, since PostgreSql does that implicitly with any stored procedure. Is that correct? Thanks to anyone who can assist. Florian G. Pflug wrote: > William Garrison wrote: >> I have a table that keeps running totals. It is possible that this >> would get called twice simultaneously for the same UserID. Do I need >> to put this in a serializable transaction? Since this gets called >> often, could it be a performance problem if I make it serializable? >> >> CREATE FUNCTION UpdateTotals(IN UserID int, >> IN AddToCount int, >> IN AddToSize bigint) >> RETURNS void AS $$ >> BEGIN >> UPDATE >> Totals >> SET >> TotalSize = TotalSize + $2, >> TotalCount = TotalCount + $3 >> WHERE >> UserID = $1; >> END IF; >> END >> $$ LANGUAGE 'plpgsql' VOLATILE; > > Where is that stray "END IF;" comming from? > Anyway, this should be safe, and work even more > reliably in read-committed mode than in serializable > mode. In serializeable mode, if the Total of the same > user is updated by two transactions simultanously, > you'll get a SerializationError. Read-Committed mode > OTOH will make sure that it uses the latest version of > the tuple for calculating the new values. > > Usually, you need serializable mode if you do updates > based on the results of previous selects. > > greetings, Florian Pflug >
Should I just use a trigger to update these totals? That way concurrent deletes/updates would be guaranteed to update the totals. The only down side is that deleting 10 records would result in 10 updates to the totals. But deleting is rare enough that it probably isn't a problem. William Garrison wrote: > This is part 2 of my "serializable" question from earlier. Thanks to > everyone who answered the first part. (Hopefully there will be no part 3) > > When a user adds a record, I update a table with totals. But when a > user deletes data, I subtract the totals. That presents a more > interesting situation since it is theoretically possible to add a record > and delete one at the same time: > > For simplicity sake, the two tables are something like: > item(itemid int, data...) > history(itemid int, versionid int, data...) > > -- This deletes the item and all the history records for the item > CREATE FUNCTION DeleteItem(IN _UserID int, > IN _ItemID int) > RETURNS void AS $$ > DECLARE > mycount int; > total bigint; > BEGIN > -- Cotal and total of data to be deleted > SELECT > COUNT(*), COALESCE(SUM(clientsize),0) > INTO > mycount,total > FROM > history > WHERE > userid= $1 AND itemid=$2; > > -- Potential problem point #1 > > -- Call the stored proc to update the count and total > -- This is the one from my last email > PERFORM sbp_UpdateTotals($1,-mycount,-total); > > -- Potential problem point #2 > > -- Delete the records > DELETE FROM history WHERE userid = $1 AND itemid=$2; > DELETE FROM items WHERE userid = $1 AND itemid=$2; > END > $$ LANGUAGE 'plpgsql' VOLATILE; > > Suppose the user adds another record to the history table with the same > itemid as the one being deleted. If they do this at one of the two > "problem points" then that history record will be deleted, but it will > not be subtracted from the totals. Am I understanding this properly? > > In my system, this is highly unlikely, and probably not even something > we care about very much. But I still want to know what to do. It seems > to me I could a few things to fix this: > > 1) Make this serializable > - I'm not even sure this will help, since adding a new history record > won't change any records that this touches. Would PostgreSQL even > realize that adding a history record would have changed the results of > the select? > 2) Table lock > 3) Make the first select statement store the PK(itemid,versionid) of the > history records and then only delete those records at the end. > 4) Somehow calculate the count and total during the delete statement > 5) Compare the # of records deleted with the # of records selected and > throw/rollback if they mismatch. > > I'm curious which one I should do, and if my solutions are all valid. > > Also, I'm under the impression that I don't need to do any sort of > begin/end transaction stuff in here, since PostgreSql does that > implicitly with any stored procedure. Is that correct? > > Thanks to anyone who can assist. > > Florian G. Pflug wrote: >> William Garrison wrote: >>> I have a table that keeps running totals. It is possible that this >>> would get called twice simultaneously for the same UserID. Do I need >>> to put this in a serializable transaction? Since this gets called >>> often, could it be a performance problem if I make it serializable? >>> >>> CREATE FUNCTION UpdateTotals(IN UserID int, >>> IN AddToCount int, >>> IN AddToSize bigint) >>> RETURNS void AS $$ >>> BEGIN >>> UPDATE >>> Totals >>> SET >>> TotalSize = TotalSize + $2, >>> TotalCount = TotalCount + $3 >>> WHERE >>> UserID = $1; >>> END IF; >>> END >>> $$ LANGUAGE 'plpgsql' VOLATILE; >> >> Where is that stray "END IF;" comming from? >> Anyway, this should be safe, and work even more >> reliably in read-committed mode than in serializable >> mode. In serializeable mode, if the Total of the same >> user is updated by two transactions simultanously, >> you'll get a SerializationError. Read-Committed mode >> OTOH will make sure that it uses the latest version of >> the tuple for calculating the new values. >> >> Usually, you need serializable mode if you do updates >> based on the results of previous selects. >> >> greetings, Florian Pflug >> > > > ---------------------------(end of broadcast)--------------------------- > TIP 9: In versions below 8.0, the planner will ignore your desire to > choose an index scan if your joining column's datatypes do not > match >