Thread: Strang behaviour SELECT ... LIMIT n FOR UPDATE
Hi, I'm facing a strange behaviour with a statement SELECT ... LIMIT n FOR UPDATE in PostgreSQL 8.1. The number of rows returned is actually (n - 1). I'm trying to find whether this is an identified issue with PostgreSQL 8.1 that might have been fixed in a later version such as 8.2; I don't have any problem in moving to a later version if needed. agoratokens=> SELECT * FROM "Tokens" INNER JOIN "Tokentypes" ON "Tokens"."type"="Tokentypes"."type" WHERE "Tokentypes"."tokenName" ='clanName' AND "Tokens"."isLocked" = false limit 2 FOR UPDATE; id | type | value | isLocked | timestamp | type | tokenName -----+------+--------------------------+----------+--------------------- -------+------+----------- 104 | 2 | RegressionTestClanName13 | f | 2007-11-27 20:40:25.208074 | 2 | clanName (1 row) agoratokens=> SELECT * FROM "Tokens" INNER JOIN "Tokentypes" ON "Tokens"."type"="Tokentypes"."type" WHERE "Tokentypes"."tokenName" ='clanName' AND "Tokens"."isLocked" = false limit 3 FOR UPDATE; id | type | value | isLocked | timestamp | type | tokenName -----+------+--------------------------+----------+--------------------- -------+------+----------- 104 | 2 | RegressionTestClanName13 | f | 2007-11-27 20:40:25.208074 | 2 | clanName 118 | 2 | RegressionTestClanName28 | f | 2007-11-21 21:10:29.872352 | 2 | clanName (2 rows) If I remove the FOR UPDATE clause, the SELECT ... LIMIT n statement returns n rows as expected: agoratokens=> SELECT * FROM "Tokens" INNER JOIN "Tokentypes" ON "Tokens"."type"="Tokentypes"."type" WHERE "Tokentypes"."tokenName" ='clanName' AND "Tokens"."isLocked" = false limit 3; id | type | value | isLocked | timestamp | type | tokenName -----+------+--------------------------+----------+--------------------- -------+------+----------- 104 | 2 | RegressionTestClanName13 | f | 2007-11-27 20:40:25.208074 | 2 | clanName 40 | 2 | RegressionTestClanName9 | f | 2007-10-15 11:27:31.897 | 2 | clanName 118 | 2 | RegressionTestClanName28 | f | 2007-11-21 21:10:29.872352 | 2 | clanName (3 rows) -- Daniel
"Daniel Caune" <daniel.caune@ubisoft.com> writes: > I'm facing a strange behaviour with a statement SELECT ... LIMIT n FOR > UPDATE in PostgreSQL 8.1. The number of rows returned is actually (n - > 1). I'm trying to find whether this is an identified issue with > PostgreSQL 8.1 that might have been fixed in a later version such as > 8.2; I don't have any problem in moving to a later version if needed. There's no known issue specifically of that form (and a quick test of 8.1 doesn't reproduce any such behavior). However, it is known and documented that LIMIT and FOR UPDATE behave rather oddly together: the LIMIT is applied first, which means that if FOR UPDATE rejects any rows as being no longer up-to-date, you get fewer than the expected number of rows out. You did not mention any concurrent activity in your example, but I'm betting there was some ... regards, tom lane
Tom Lane wrote: > "Daniel Caune" <daniel.caune@ubisoft.com> writes: > > I'm facing a strange behaviour with a statement SELECT ... LIMIT n FOR > > UPDATE in PostgreSQL 8.1. The number of rows returned is actually (n - > > 1). I'm trying to find whether this is an identified issue with > > PostgreSQL 8.1 that might have been fixed in a later version such as > > 8.2; I don't have any problem in moving to a later version if needed. > > There's no known issue specifically of that form (and a quick test of > 8.1 doesn't reproduce any such behavior). However, it is known and > documented that LIMIT and FOR UPDATE behave rather oddly together: > the LIMIT is applied first, which means that if FOR UPDATE rejects > any rows as being no longer up-to-date, you get fewer than the expected > number of rows out. You did not mention any concurrent activity in > your example, but I'm betting there was some ... Current documentation explains why in the SELECT manual page: It is possible for a <command>SELECT</> command using both <literal>LIMIT</literal> and <literal>FOR UPDATE/SHARE</literal> clauses to return fewer rows than specified by <literal>LIMIT</literal>. This is because <literal>LIMIT</>is applied first. The command selects the specified number of rows, but might then block trying toobtain lock on one or more of them. Once the <literal>SELECT</> unblocks, the row might have been deleted or updated so that it does not meet the query <literal>WHERE</> condition anymore, in which case it will not be returned. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
> -----Message d'origine----- > De : Tom Lane [mailto:tgl@sss.pgh.pa.us] > Envoyé : mardi, novembre 27, 2007 23:46 > À : Daniel Caune > Cc : pgsql-sql@postgresql.org > Objet : Re: [SQL] Strang behaviour SELECT ... LIMIT n FOR UPDATE > > "Daniel Caune" <daniel.caune@ubisoft.com> writes: > > I'm facing a strange behaviour with a statement SELECT ... LIMIT n FOR > > UPDATE in PostgreSQL 8.1. The number of rows returned is actually (n - > > 1). I'm trying to find whether this is an identified issue with > > PostgreSQL 8.1 that might have been fixed in a later version such as > > 8.2; I don't have any problem in moving to a later version if needed. > > There's no known issue specifically of that form (and a quick test of > 8.1 doesn't reproduce any such behavior). However, it is known and > documented that LIMIT and FOR UPDATE behave rather oddly together: > the LIMIT is applied first, which means that if FOR UPDATE rejects > any rows as being no longer up-to-date, you get fewer than the expected > number of rows out. You did not mention any concurrent activity in > your example, but I'm betting there was some ... > > regards, tom lane Yes, you were betting right. However I would have thought that the SELECT ... FOR UPDATE statement blocks if another processwere locking the same rows. The record values don't change from a call to another. I did read the documentation, especially the section that Bruce Momjian'spointed me out, but I don't think that it corresponds to this case (cf. my test). I did the following test, removing all the where-clause from the SELECT statement. Every statement completes immediately,i.e. it doesn't block. agoratokens=> select id from "Tokens" id ----- 47104 44 42 33 69 94 89 90... Time: 119.314 ms agoratokens=> select id from "Tokens" limit 2 for update;id ----- 47104 (2 rows) Time: 17.679 ms agoratokens=> select id from "Tokens" limit 3 for update;id ----- 47104 (2 rows) Time: 20.452 ms The statement doesn't return the row where id equals to 44. agoratokens=> select id from "Tokens" limit 3;id ----- 47104 44 (3 rows) Time: 1.186 ms The statement returns the row where id equals to 44. agoratokens=> select id from "Tokens" limit 3 for update;id ----- 47104 (2 rows) Time: 9.473 ms The statement still doesn't return the row where id equals to 44. agoratokens=> select id from "Tokens" where id IN (47, 104, 44, 42) limit 3 for update;id ----- 47104 44 (3 rows) This time, the statement returns the row where id equals to 44. agoratokens=> select id from "Tokens" limit 3;id ----- 47104 44 (3 rows) Time: 7.547 ms agoratokens=> select id from "Tokens" limit 5 for update;id ----- 47104 33 (3 rows) Time: 11.725 ms This time, the statement doesn't return the rows where id equals to 44 and 42. agoratokens=> select id from "Tokens" limit 8 for update;id ----- 47104 33 69 94 89 (6 rows) Time: 11.794 ms The statement still doesn't return the rows where id equals to 44 a 42. agoratokens=> select id from "Tokens" where id = 44 limit 3 for update;id ----44 (1 row) Time: 14.172 ms The statement does return the row where id equals to 44. "However, it is known and documented that LIMIT and FOR UPDATE behave rather oddly together: the LIMIT is applied first,which means that if FOR UPDATE rejects any rows as being no longer up-to-date, you get fewer than the expected numberof rows out." Tom, when you say "rows as being no longer up-to-date", do you mean which values don't match anymore the where-clauses ofthe SELECT statement? If so, that doesn't correspond to my test since I remove every where-clause. Any ideas, any other tests I can try? Thanks, -- Daniel
"Daniel Caune" <daniel.caune@ubisoft.com> writes: > I did the following test, removing all the where-clause from the SELECT statement. Every statement completes immediately,i.e. it doesn't block. I think you left out some critical information, like who else was doing what to the table. What it looks like to me is that the third and fourth rows in this view were live according to your transaction snapshot, but were committed dead as of current time, and so FOR UPDATE wouldn't return them. > agoratokens=> select id from "Tokens" where id IN (47, 104, 44, 42) limit 3 for update; > This time, the statement returns the row where id equals to 44. No, it returns *some* row where id equals 44. Not necessarily the same one seen in the seqscan. (I imagine this query is using an index, and so would visit rows in a different physical order.) Printing the ctid of the rows would confirm or disprove that theory. regards, tom lane
> De : Tom Lane [mailto:tgl@sss.pgh.pa.us] > > "Daniel Caune" <daniel.caune@ubisoft.com> writes: > > I did the following test, removing all the where-clause from the SELECT > statement. Every statement completes immediately, i.e. it doesn't block. > > I think you left out some critical information, like who else was doing > what to the table. > > What it looks like to me is that the third and fourth rows in this view > were live according to your transaction snapshot, but were committed > dead as of current time, and so FOR UPDATE wouldn't return them. > > > agoratokens=> select id from "Tokens" where id IN (47, 104, 44, 42) > limit 3 for update; > > This time, the statement returns the row where id equals to 44. > > No, it returns *some* row where id equals 44. Not necessarily the same > one seen in the seqscan. (I imagine this query is using an index, and > so would visit rows in a different physical order.) Printing the ctid > of the rows would confirm or disprove that theory. > > regards, tom lane Thanks Tom. I think this time you will point me out the problem. The column id has a primary key constraint on. Thereshould not be more than one row with id equals to 44. agoratokens=> \d "Tokens" Table "public.Tokens" Column | Type | Modifiers -----------+--------------------------------+-------------------------------------------------------id | integer | not null default nextval('"Tokens_id_seq"'::regclass)type | integer |not nullvalue | character varying(255) | not nullisLocked | boolean | not null defaultfalsetimestamp | timestamp(6) without time zone | Indexes: "Tokens_pkey" PRIMARY KEY, btree (id) (...) agoratokens=> select ctid, * from "Tokens" where "isLocked" = true limit 3 for update;ctid | id | type | value | isLocked| timestamp ------+----+------+-------+----------+----------- (0 rows) agoratokens=> select ctid, * from "Tokens" where "isLocked" = true limit 3; ctid | id | type | value | isLocked | timestamp -----------+----+------+-------+----------+--------------------------- (199,84) | 44 | 3 | 3 | t | 2007-04-0312:12:02.46944(199,114) | 42 | 3 | 1 | t | 2007-04-03 13:00:44.877 (2 rows) agoratokens=> select ctid, * from "Tokens" where id = 44; ctid | id | type | value | isLocked | timestamp -----------+----+------+-------+----------+----------------------------(3702,85) | 44 | 3 | 3 | f | 2007-11-2216:41:33.494371 (1 row) agoratokens=> select count(*) from "Tokens" where id = 44;count ------- 1 (1 row) It seems that, in certain condition, row (199,84) is shadowing row (3702,85); my feeling from a "customer" high level. Indeed,as a PostgreSQL core developer, that assertion could make you laugh... :-) I took into account your point about the concurrent context. Therefore I isolated the database from any connection exceptmine. # TYPE DATABASE USER IP-ADDRESS IP-MASK METHOD local all all trust host all all 127.0.0.1 255.255.255.0 password #host all all 10.3.41.0 255.255.254.0 password sudo /etc/init.d/postgresql-8.1 restart* Restarting PostgreSQL 8.1 database server [ ok ] No other client than my psql was connected to PostgreSQL. You can trust me. The result is exactly the same: agoratokens=> select ctid, * from "Tokens" where "isLocked" = true limit 3; ctid | id | type | value | isLocked | timestamp -----------+----+------+-------+----------+--------------------------- (199,84) | 44 | 3 | 3 | t | 2007-04-0312:12:02.46944(199,114) | 42 | 3 | 1 | t | 2007-04-03 13:00:44.877 (2 rows) agoratokens=> select ctid, * from "Tokens" where id = 44; ctid | id | type | value | isLocked | timestamp -----------+----+------+-------+----------+----------------------------(3702,85) | 44 | 3 | 3 | f | 2007-11-2216:41:33.494371 (1 row) agoratokens=> select ctid, * from "Tokens" where "isLocked" = true limit 3; ctid | id | type | value | isLocked | timestamp -----------+----+------+-------+----------+--------------------------- (199,84) | 44 | 3 | 3 | t | 2007-04-0312:12:02.46944(199,114) | 42 | 3 | 1 | t | 2007-04-03 13:00:44.877 (2 rows) agoratokens=> select ctid, * from "Tokens" where id = 44; ctid | id | type | value | isLocked | timestamp -----------+----+------+-------+----------+----------------------------(3702,85) | 44 | 3 | 3 | f | 2007-11-2216:41:33.494371 (1 row) agoratokens=> select count(*) from "Tokens" where id = 44;count ------- 1 (1 row) By the way, according to the "business logic", the timestamp "2007-04-03 12:12:02.46944" is weird, because too old. I apologizeif my question is stupid because of my knowledge lack, but would it possible that for some reasons the related SELECTstatement uses an old snapshot? Regards, -- Daniel
"Daniel Caune" <daniel.caune@ubisoft.com> writes: > It seems that, in certain condition, row (199,84) is shadowing row > (3702,85); This would be the expected behavior if row (199,84) were an updated version of row (3702,85), but you couldn't see it yet in your current transaction snapshot. A plain SELECT would show the older version (the current one according to the snapshot) while SELECT FOR UPDATE would show the newest committed version. I think you must have somehow got a corrupt-data situation with respect to the commit status of these rows, but it's not real clear how. Would you show us the xmin and xmax of the rows, and also the current transaction counter? (pg_controldata will give you a close-enough idea of the latter.) regards, tom lane
Tom Lane wrote: > "Daniel Caune" <daniel.caune@ubisoft.com> writes: > > It seems that, in certain condition, row (199,84) is shadowing row > > (3702,85); > > This would be the expected behavior if row (199,84) were an updated > version of row (3702,85), but you couldn't see it yet in your current > transaction snapshot. A plain SELECT would show the older version > (the current one according to the snapshot) while SELECT FOR UPDATE > would show the newest committed version. Hmm. We've been studying a case on one customer where xmin/xmax seem to be corrupted. It has had ups and downs because I have my doubts about their storage system, but I'm not completely sure that it can be really blamed. This is on 8.1.10. -- Alvaro Herrera Valdivia, Chile ICBM: S 39º 49' 18.1", W 73º 13' 56.4" Major Fambrough: You wish to see the frontier? John Dunbar: Yes sir, before it's gone.