Thread: Table Lock issue
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, maybe my mind is stuck, but here's something strange. This is the classic "counter" thing, where you can't / won't use sequences. Basically I need to assemble an identifier like AAA-000012 where AAA- is the prefix and the number is sequencially counted up. The table is CREATE TABLE ib_counter ( name character varying(64) NOT NULL, prefix character varying(64) NOT NULL, last_value integer NOT NULL, display_length integer DEFAULT 0, renewal_prefix character varying(64), renewal_extension boolean, display_initially boolean, renewal_start integer ); I use a stored proc to get the next identifier: CREATE OR REPLACE FUNCTION ib_nextval (character varying) RETURNS character varying AS 'DECLARE countername ALIAS FOR $1; cprefix varchar; counter integer; dlen integer; complete varchar; format varchar; BEGIN LOCK TABLE ib_counter IN ACCESS EXCLUSIVE MODE; UPDATE ib_counter SET last_value=last_value+1 WHERE name=countername; SELECT INTO cprefix,counter,dlen prefix,last_value,display_length FROM ib_counter WHERE name=countername; format := ''FM''; FOR i IN 1..dlen LOOP format := format || ''0''; END LOOP; complete := cprefix || to_char(counter,format); RAISE NOTICE ''result is %,%,%,%'',complete,cprefix,counter,dlen; RETURN complete; END; Here the point. I can create duplicate identifiers. The stored proc is called within a transaction like (pseudocode) Begin value=ib_nextval('mycounter') do something with value commit My assumption would be that if I do an exclusive lock on the table I can't do the update or a second exclusive lock, so the stored proc should block (or fail). Obviously it doesn't work that way, since as said I get duplicates. Any ideas anyone ? Thanks UC - -- Open Source Solutions 4U, LLC 2570 Fleetwood Drive Phone: +1 650 872 2425 San Bruno, CA 94066 Cell: +1 650 302 2405 United States Fax: +1 650 872 2417 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.3 (GNU/Linux) iD8DBQFAftXwjqGXBvRToM4RAiZEAKDRlceKo84vzQZ82iT4R45+gYPamgCfbQYT 9cqaTBxsn1aiPni9+X4j1MM= =2tXJ -----END PGP SIGNATURE-----
"Uwe C. Schroeder" <uwe@oss4u.com> writes: > I use a stored proc to get the next identifier: > CREATE OR REPLACE FUNCTION ib_nextval (character varying) RETURNS character > varying > ... > BEGIN > LOCK TABLE ib_counter IN ACCESS EXCLUSIVE MODE; > UPDATE ib_counter SET last_value=last_value+1 WHERE name=countername; > SELECT INTO cprefix,counter,dlen prefix,last_value,display_length FROM ib_counter WHERE name=countername; > My assumption would be that if I do an exclusive lock on the table I > can't do the update or a second exclusive lock, so the stored proc > should block (or fail). It does block, and it does do the update correctly (at least if you're not doing this in serializable mode). The problem is that the SELECT doesn't get the right result. The SELECT actually sees two row versions as being valid: the one you just created by UPDATE, and whichever one was current when the outer transaction started. But SELECT INTO will return at most one row, so it's roll-of-the-dice which one you get. You can avoid this by attaching FOR UPDATE to the SELECT. There have been discussions about this effect in the past (try searching the pghackers archives for mentions of SetQuerySnapshot). In this particular example it definitely seems like a bug, but if we fix it by performing SetQuerySnapshot between statements of a plpgsql function, we may break existing applications that aren't expecting that to happen. So far there's not been a consensus to change the behavior. BTW, I'd lose the LOCK if I were you; it doesn't do anything for you except prevent concurrent updates of different counters. The row lock obtained by the UPDATE is sufficient. regards, tom lane
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Thursday 15 April 2004 04:20 pm, Tom Lane wrote: > "Uwe C. Schroeder" <uwe@oss4u.com> writes: > > I use a stored proc to get the next identifier: > > > > CREATE OR REPLACE FUNCTION ib_nextval (character varying) RETURNS > > character varying > > ... > > BEGIN > > LOCK TABLE ib_counter IN ACCESS EXCLUSIVE MODE; > > UPDATE ib_counter SET last_value=last_value+1 WHERE > > name=countername; SELECT INTO cprefix,counter,dlen > > prefix,last_value,display_length FROM ib_counter WHERE name=countername; > > > > My assumption would be that if I do an exclusive lock on the table I > > can't do the update or a second exclusive lock, so the stored proc > > should block (or fail). > > It does block, and it does do the update correctly (at least if you're > not doing this in serializable mode). The problem is that the SELECT > doesn't get the right result. The SELECT actually sees two row versions > as being valid: the one you just created by UPDATE, and whichever one > was current when the outer transaction started. One question to "was current when the outer transaction started". Does that mean that if I have a long running transaction, all selects inside that transaction will only see what was committed as of the start of that transaction ? So if I do a "update xxx set ...." outside of the transaction, nothing inside the transaction will ever see that change, although it's committed ? This is "read committed" isolation level, where I would expect the selects inside the transaction see anything that is committed, not what WAS committed at the start of the transaction. > But SELECT INTO will > return at most one row, so it's roll-of-the-dice which one you get. > You can avoid this by attaching FOR UPDATE to the SELECT. > > There have been discussions about this effect in the past (try searching > the pghackers archives for mentions of SetQuerySnapshot). In this > particular example it definitely seems like a bug, but if we fix it by > performing SetQuerySnapshot between statements of a plpgsql function, > we may break existing applications that aren't expecting that to happen. > So far there's not been a consensus to change the behavior. > > BTW, I'd lose the LOCK if I were you; it doesn't do anything for you > except prevent concurrent updates of different counters. The row lock > obtained by the UPDATE is sufficient. > > regards, tom lane - -- UC - -- Open Source Solutions 4U, LLC 2570 Fleetwood Drive Phone: +1 650 872 2425 San Bruno, CA 94066 Cell: +1 650 302 2405 United States Fax: +1 650 872 2417 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.3 (GNU/Linux) iD8DBQFAfyxsjqGXBvRToM4RAs0pAJ0cwAE/BdrLL/lq3Y2jBnmnW7rMFwCg0mXN 6EXDA/UH1kBRdnz0sm+NgSE= =hT3X -----END PGP SIGNATURE-----
"Uwe C. Schroeder" <uwe@oss4u.com> writes: > On Thursday 15 April 2004 04:20 pm, Tom Lane wrote: >> ... The problem is that the SELECT >> doesn't get the right result. The SELECT actually sees two row versions >> as being valid: the one you just created by UPDATE, and whichever one >> was current when the outer transaction started. > One question to "was current when the outer transaction started". I knew I wasn't going to get away with that explanation ;-). Okay, here's the long version. Let's imagine this situation: initially there's one row with last_value 1. Transaction A comes and does begin; select ib_nextval('mycounter'); but doesn't commit yet. Now there are two rows: one with last_value 1, which is marked as created by some past committed transaction, and as deleted by transaction A which is as yet uncommitted. There is also a row with last_value 2, which is marked as created by transaction A and deleted by nobody. Now transaction B comes and does select ib_nextval('mycounter'); It's going to block until A commits --- in your version, it blocks at the LOCK-table-exclusively command, in mine at the UPDATE because the UPDATE sees the row lock on the updated row. But in either case it waits. Once xact A commits, B proceeds to mark the row with last_value 2 as deleted by itself, and creates a row with last_value 3, created by itself and deleted by nobody. (This row will have last_value 3, not something less, because the UPDATE will use the latest available committed row as the starting point for its "last_value+1" computation.) Now we come to the tricky part: transaction B does its SELECT. Which of the three available rows will it consider valid? Because this SELECT is inside a function, and we don't advance the QuerySnapshot inside a function, the SELECT will be applying MVCC rules with respect to a snapshot that was taken when the outer "select ib_nextval()" began --- in other words, before transaction A committed. So the validity checks stack up like this: * original row with last_value 1: created by a long-since-committed transaction, and deleted by a transaction (xact A) that had not committed at the time of the snapshot. Ergo, good. * second row with last_value 2: created by a not-yet-committed xact (A) and deleted by my own transaction. Loses on either count; not good. * third row with last_value 3: created by my own transaction and deleted by nobody. Ergo, good. So both last_value 1 and last_value 3 are visible to the SELECT, and it's a crapshoot which will come up first in SELECT INTO. If we were to advance the QuerySnaphot between statements of a plpgsql function, the problem would go away because the SELECT would see transaction A as already committed, making the original row not-good. Now in this situation it is good to recognize the effects of other transactions between statements of a plpgsql function, but it's not hard to think up cases in which plpgsql functions would break if the visible database state changes between statements. So it's a bit of a tough choice what to do. I'm personally starting to think that we *should* advance the QuerySnapshot, but as I said there's not yet a consensus about it. Oh, one other point: SELECT FOR UPDATE fixes this because it has different visibility rules. Like UPDATE, it will *never* consider good a row version that is marked as deleted by any committed transaction. regards, tom lane
On Fri, 2004-04-16 at 00:17, Tom Lane wrote: > "Uwe C. Schroeder" <uwe@oss4u.com> writes: > > On Thursday 15 April 2004 04:20 pm, Tom Lane wrote: <snip most excellent summary of how this problem manifests itself> > > If we were to advance the QuerySnaphot between statements of a plpgsql > function, the problem would go away because the SELECT would see > transaction A as already committed, making the original row not-good. > > Now in this situation it is good to recognize the effects of other > transactions between statements of a plpgsql function, but it's not hard > to think up cases in which plpgsql functions would break if the visible > database state changes between statements. So it's a bit of a tough > choice what to do. I'm personally starting to think that we *should* > advance the QuerySnapshot, but as I said there's not yet a consensus > about it. > The problem is that, while the people for changing this behavior keep stacking up, theres no way to quantify how many people it would cause trouble for... really we're only going on the theory that it could cause trouble for people; i don't recall anyone posting a real world example that requires the current semantics. > Oh, one other point: SELECT FOR UPDATE fixes this because it has > different visibility rules. Like UPDATE, it will *never* consider good > a row version that is marked as deleted by any committed transaction. > The proposal to update the query snapshot inside plpgsql whenever a lock table in exclusive access is issued follows along this line and would seem like one way to help most people get around this problem since it's hard to imagine any real world scenario where one would want to lock a table exclusively and still see rows that are modified by other transactions. Robert Treat -- Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
Robert Treat <xzilla@users.sourceforge.net> writes: > On Fri, 2004-04-16 at 00:17, Tom Lane wrote: >> Now in this situation it is good to recognize the effects of other >> transactions between statements of a plpgsql function, but it's not hard >> to think up cases in which plpgsql functions would break if the visible >> database state changes between statements. So it's a bit of a tough >> choice what to do. I'm personally starting to think that we *should* >> advance the QuerySnapshot, but as I said there's not yet a consensus >> about it. > The problem is that, while the people for changing this behavior keep > stacking up, theres no way to quantify how many people it would cause > trouble for... really we're only going on the theory that it could cause > trouble for people; i don't recall anyone posting a real world example > that requires the current semantics. Well, that's because the people who need the current behavior (if any) haven't had reason to complain ;-). In general however it seems like one would expect a series of statements issued inside a plpgsql function to behave the same as if they were issued interactively (inside a transaction block of course). Right now that is true for SERIALIZABLE mode but not true for READ COMMITTED mode. So I'm starting to lean to the idea that we should change it and document it as a potential incompatibility. >> Oh, one other point: SELECT FOR UPDATE fixes this because it has >> different visibility rules. Like UPDATE, it will *never* consider good >> a row version that is marked as deleted by any committed transaction. > The proposal to update the query snapshot inside plpgsql whenever a lock > table in exclusive access is issued follows along this line and would > seem like one way to help most people get around this problem since it's > hard to imagine any real world scenario where one would want to lock a > table exclusively and still see rows that are modified by other > transactions. Hm, I don't recall having heard that proposal in this context, and I can't say that I like it. In the first place we don't really want to be encouraging people to lock tables exclusively, and in the second place I'm not sure plpgsql can detect all such cases (what if the LOCK is issued inside another function, or via EXECUTE?). In the third place it would create yet another subtle discrepancy in execution behavior between functions and interactive commands, which seems like the wrong direction to be going in. regards, tom lane