Thread: functions, transactions, key violations
We have a function which has been throwing a key violation. We're tracking the purchase history for a limited number of unique items, and flagging the most recent purchase with purchase_status = 1. Stripped down, here's the schema and the function:

CREATE TABLE purchases (
    purchase_id SERIAL PRIMARY KEY,
    item_id INT NOT NULL,
    purchased_by INT NOT NULL,
    purchase_price INT NOT NULL,
    purchased_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,
    purchase_status INT NOT NULL DEFAULT 1
);

CREATE UNIQUE INDEX purchases_purchase_id_where_purchase_status_eq_1_key
    ON purchases (purchase_id)
    WHERE purchase_status = 1;

So, when a purchase is made for a given item, we also need to update the purchase_status of the previous purchase.

CREATE OR REPLACE FUNCTION
purchase(IN in_item_id integer,
         IN in_purchased_by bigint,
         IN in_purchase_price integer)
RETURNS VOID AS
$BODY$
BEGIN
    -- some selects
    UPDATE purchases
       SET purchase_status = 0
     WHERE item_id = in_item_id
       AND purchase_status = 1;

    INSERT INTO purchases (item_id, purchased_by, purchase_price)
    VALUES (in_item_id, in_purchased_by, in_purchase_price);
    -- some more manipulation
END
$BODY$
LANGUAGE 'plpgsql' VOLATILE;

We're getting errors that the INSERT INTO purchases within the purchase function is violating the purchases_purchase_id_where_purchase_status_eq_1_key constraint. How can this be, if the function, called as a single statement (i.e., not within an explicit transaction), is its own transaction? The Postgres documentation shows another example, which leads me to believe I'm missing something. If a function does occur within a transaction, I don't understand why the exception block is necessary in Example 38-1, Exceptions with UPDATE/INSERT [1].
Given the table from the example:

CREATE TABLE db (a INT PRIMARY KEY, b TEXT);

If I issue the following transaction,

BEGIN;
UPDATE db SET b = data WHERE a = key;
INSERT INTO db(a,b) VALUES (key, data);
COMMIT;

and the UPDATE didn't affect any rows, I'd expect the transaction to be successful. What am I missing?

Thanks for your help.

Michael Glaesemann
grzm seespotcode net

[1] http://www.postgresql.org/docs/8.3/interactive/plpgsql-control-structures.html#PLPGSQL-UPSERT-EXAMPLE
Michael Glaesemann <grzm@seespotcode.net> writes:
> We have a function which has been throwing a key violation. We're
> tracking the purchase history for a limited number of unique items,
> and flagging the most recent purchase with purchase_status = 1.

Um, the function is ensuring that there is no conflicting row with the same item_id, but the unique constraint is on purchase_id. Plus, since purchase_id is declared PRIMARY KEY, the additional unique constraint is surely a no-op. I think you've anonymized the example into nonsense :-(.

regards, tom lane
On Jun 4, 2008, at 5:39 PM, Tom Lane wrote:
> I think you've anonymized the example into nonsense :-(.

Now that I've provided yet another example of the perils of not providing the exact code, I've tried to patch it (below, for those of you willing to give me a second chance). However, my point regarding the example in the docs still holds: why is the exception block necessary? Doesn't wrapping the statements in a function ensure the unique_violation couldn't occur?

Again, this is from <http://www.postgresql.org/docs/8.3/interactive/plpgsql-control-structures.html#PLPGSQL-UPSERT-EXAMPLE>

CREATE TABLE db (a INT PRIMARY KEY, b TEXT);

CREATE FUNCTION merge_db(key INT, data TEXT) RETURNS VOID AS
$$
BEGIN
    LOOP
        -- first try to update the key
        UPDATE db SET b = data WHERE a = key;
        IF found THEN
            RETURN;
        END IF;
        -- not there, so try to insert the key
        -- if someone else inserts the same key concurrently,
        -- we could get a unique-key failure
        BEGIN
            INSERT INTO db(a,b) VALUES (key, data);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- do nothing, and loop to try the UPDATE again
        END;
    END LOOP;
END;
$$
LANGUAGE plpgsql;

Michael Glaesemann
grzm seespotcode net

The corrected schema and function:

CREATE TABLE purchases (
    purchase_id SERIAL PRIMARY KEY,
    item_id INT NOT NULL,
    purchased_by INT NOT NULL,
    purchase_price INT NOT NULL,
    purchased_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,
    purchase_status INT NOT NULL DEFAULT 1
);

CREATE UNIQUE INDEX purchases_purchase_id_where_purchase_status_eq_1_key
    ON purchases (item_id)
    WHERE purchase_status = 1;

CREATE OR REPLACE FUNCTION
purchase(IN in_item_id integer,
         IN in_purchased_by bigint,
         IN in_purchase_price integer)
RETURNS VOID AS
$BODY$
BEGIN
    -- some selects
    UPDATE purchases
       SET purchase_status = 0
     WHERE item_id = in_item_id
       AND purchase_status = 1;

    INSERT INTO purchases (item_id, purchased_by, purchase_price)
    VALUES (in_item_id, in_purchased_by, in_purchase_price);
    -- some more manipulation
END
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
> CREATE OR REPLACE FUNCTION
> purchase(IN in_item_id integer,
>          IN in_purchased_by bigint,
>          IN in_purchase_price integer)
> RETURNS VOID AS
> $BODY$
> BEGIN
>     -- some selects
>     UPDATE purchases
>        SET purchase_status = 0
>      WHERE item_id = in_item_id
>        AND purchase_status = 1;
>     INSERT INTO purchases (item_id, purchased_by, purchase_price)
>     VALUES (in_item_id, in_purchased_by, in_purchase_price);
>     -- some more manipulation
> END
> $BODY$
> LANGUAGE 'plpgsql' VOLATILE;

It looks like some concurrent statement could come between the UPDATE and the INSERT and still cause a unique constraint violation.

> The Postgres documentation shows another example, which leads me to
> believe I'm missing something. If a function does occur within a
> transaction, I don't understand why the exception block is necessary
> in Example 38-1, Exceptions with UPDATE/INSERT [1]. Given the table
> from the example:
> [snip]
> and the UPDATE didn't affect any rows, I'd expect the transaction to
> be successful.

The comment from the example explains it:

-- not there, so try to insert the key
-- if someone else inserts the same key concurrently,
-- we could get a unique-key failure

The idea of the example is to try to update, and if it doesn't affect any rows, then insert. Some other concurrent transaction could still insert something after the UPDATE but before the INSERT, so the unique constraint violation can still occur.

Regards,
Jeff Davis
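[Editor's note: for concreteness, here is an untested sketch of how purchase() could adopt the retry pattern from the documentation's merge_db example — the INSERT is wrapped in an exception block, and on a unique_violation the function loops back to demote the row the other session just inserted. The function name and parameters follow the example above; this is not code from the thread.]

```sql
CREATE OR REPLACE FUNCTION
purchase(IN in_item_id integer,
         IN in_purchased_by bigint,
         IN in_purchase_price integer)
RETURNS VOID AS
$BODY$
BEGIN
    LOOP
        -- demote whatever row currently carries purchase_status = 1
        UPDATE purchases
           SET purchase_status = 0
         WHERE item_id = in_item_id
           AND purchase_status = 1;

        BEGIN
            INSERT INTO purchases (item_id, purchased_by, purchase_price)
            VALUES (in_item_id, in_purchased_by, in_purchase_price);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- another session inserted a purchase_status = 1 row for this
            -- item between our UPDATE and INSERT; loop to demote it and retry
        END;
    END LOOP;
END
$BODY$
LANGUAGE plpgsql VOLATILE;
```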
On Jun 4, 2008, at 6:00 PM, Jeff Davis wrote:
> Some other concurrent transaction could still
> insert something after the UPDATE but before the INSERT, so the unique
> constraint violation can still occur.

Yes, I saw the comment. I'm guessing I'm missing something wrt transaction isolation level or locking. Would I need to use SERIALIZABLE or some kind of locking? Is the function in the example any different from the following explicit transaction (with the UPDATE not affecting any rows)?

BEGIN;
UPDATE db SET b = data WHERE a = key;
INSERT INTO db(a,b) VALUES (key, data);
COMMIT;

I'm obviously on the cusp of learning something new, or of understanding concurrency more deeply, so I'm looking forward to your responses.

Cheers,
Michael Glaesemann
grzm seespotcode net
Michael Glaesemann <grzm@seespotcode.net> writes:
> However, my point regarding the example in the docs still holds. Why
> is the exception block necessary? Doesn't wrapping the statements in a
> function ensure the unique_violation couldn't occur?

Well, the point of that example is to deal correctly with the case where two sessions try to insert the same key concurrently. The second to reach the index insertion point will see the conflicting index entry already there; it will block until it sees the other transaction commit or abort, and if that transaction commits, it will then throw a unique-violation exception.

Are you certain that concurrent executions of purchase() for the same item_id can never happen in your system? Because that certainly seems the most likely explanation from here.

regards, tom lane
On Wed, 2008-06-04 at 18:21 -0400, Michael Glaesemann wrote:
> Yes, I saw the comment. I'm guessing I'm missing something wrt
> transaction isolation level or locking. Would I need to use
> SERIALIZABLE or some kind of locking? Is the function in the example
> any different than the following explicit transaction (with the UPDATE
> not affecting any rows)?

I think the best way to explain this is with a timeline of two concurrent sessions, s1 and s2:

s1: BEGIN;
s2: BEGIN;
s1: UPDATE db SET b = data WHERE a = key;    -- matches no rows
s2: UPDATE db SET b = data WHERE a = key;    -- matches no rows
s1: INSERT INTO db(a,b) VALUES (key, data);  -- inserts with a = key
s1: COMMIT;
s2: INSERT INTO db(a,b) VALUES (key, data);  -- unique violation!

Notice that neither of the UPDATEs blocks, because neither matches any rows, so there is no conflict. The exception handling in the loop in the example then retries s2 entirely, which then (correctly) updates the tuple rather than inserting.

There's some degenerate case, I suppose, where sessions are perfectly synchronized with DELETEs such that it causes an infinite loop, but that's a pretty unrealistic scenario.

SERIALIZABLE transactions don't really affect this, because the UPDATEs still don't match any rows. Serializable transactions really only affect the snapshot that you see and whether an UPDATE/DELETE causes a serialization error (which can only happen if it matches some rows).

The thing about a relation constraint (like UNIQUE) is that two completely separate tuples can conflict with each other. That requires a relation-level synchronization mechanism, because the constraint can't be enforced by examining tuples (or any proper subset of the relation) independently. The general way to implement a relation constraint is by using LOCK TABLE to prevent other concurrent sessions from interfering (as you suggest above).
This obviously has very bad performance, which is why UNIQUE indexes provide another synchronization mechanism at the subtransaction level.

Regards,
Jeff Davis
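[Editor's note: to make the LOCK TABLE approach mentioned above concrete, here is an untested sketch against the docs' db table. SHARE ROW EXCLUSIVE mode conflicts with itself and with the ROW EXCLUSIVE lock taken by INSERT/UPDATE, so only one such merge runs at a time while plain SELECTs proceed; the literal key value 1 and data value 'data' are placeholders, not from the thread.]

```sql
BEGIN;
-- serialize concurrent merges on this table; readers are not blocked
LOCK TABLE db IN SHARE ROW EXCLUSIVE MODE;

UPDATE db SET b = 'data' WHERE a = 1;

-- insert only if the update found nothing; safe because we hold the lock
INSERT INTO db (a, b)
SELECT 1, 'data'
WHERE NOT EXISTS (SELECT 1 FROM db WHERE a = 1);
COMMIT;
```

This trades all write concurrency on the table for correctness, which is exactly the "very bad performance" noted above.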
Apart from concurrency issues, it is possible that you have a sequence generation problem. Depending on how you inserted the original rows into the purchases table, it is possible that the sequence's nextval has not kept up and is lagging behind. You need to ensure that purchases_purchase_id_seq is pointing to the correct next value! That is, if its current nextval is 100, but you already have 110 rows in the table (without gaps), it is no wonder you will receive primary key violations for the next 10 inserts, though it will work fine afterwards. (Assuming, of course, you are the only one receiving values from the sequence.)

Thanks,
Ioannis Tambouras
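[Editor's note: a sketch of how to check and resynchronize the sequence described above, assuming the default SERIAL sequence name; not code from the thread.]

```sql
-- what will the next id be?
SELECT last_value FROM purchases_purchase_id_seq;

-- bump the sequence to the current maximum, so the next
-- nextval() returns max(purchase_id) + 1
SELECT setval(pg_get_serial_sequence('purchases', 'purchase_id'),
              (SELECT max(purchase_id) FROM purchases));
```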