Thread: BUG #2379: Duplicate pkeys in table
The following bug has been logged online: Bug reference: 2379 Logged by: Philip Warner Email address: pjw@rhyme.com.au PostgreSQL version: 8.0.7 Operating system: FreeBSD Description: Duplicate pkeys in table Details: We have an intermittent bug that occurs on a table which is updated several times per second. The bug occurs every few days/weeks. It is usually preceeded by a "tuple concurrently updated" messages, but I could not swear it is always preceeded by it. The result of the bug is demonstrated by: select id,count(*) from xxx group by id having count(*)>1; id | count -------+------- 24613 | 6 (1 row) (where ID is the pkey (and xxx is not the real table name)). The row concerned is one of the most frequestly updated rows, and shows only one row when I do a 'select * from xxx where id = 24613'. If I do a select id from xxx order by id, I get 6 duplicate rows. We have tried a reindex: mail=# reindex table xxx; ERROR: could not create unique index DETAIL: Table contains duplicated values. All of which seems to suggest that there really are 6 data rows with the same pkey. None of our code changes the pkey, but we do have several places that update this table intriggers. Any help or suggestions would be greatly appreciated.
Philip Warner wrote: > We have an intermittent bug that occurs on a table which is updated several > times per second. The bug occurs every few days/weeks. It is usually > preceeded by a "tuple concurrently updated" messages, but I could not swear > it is always preceeded by it. > > The result of the bug is demonstrated by: > > select id,count(*) from xxx group by id having count(*)>1; > id | count > -------+------- > 24613 | 6 > (1 row) Please do a SELECT xmin, xmax, cmin, cmax FROM xxx where id = 24613; if you still have that particular manifestation. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Thu, Apr 06, 2006 at 08:12:31AM -0400, Alvaro Herrera wrote: > Please do a > > SELECT xmin, xmax, cmin, cmax FROM xxx where id = 24613; > > if you still have that particular manifestation. Also, you'll probably need to set enable_indexscan to off prior to running the above query. -- Michael Fuhr
Michael Fuhr wrote: > On Thu, Apr 06, 2006 at 08:12:31AM -0400, Alvaro Herrera wrote: > >> Please do a >> >> SELECT xmin, xmax, cmin, cmax FROM xxx where id = 24613; >> # set enable_indexscan=off; # SELECT xmin, xmax, cmin, cmax FROM xxx where id = 24613; xmin | xmax | cmin | cmax ----------+----------+------+---------- 32902771 | 0 | 20 | 32902872 32902771 | 0 | 20 | 32902872 32902771 | 0 | 20 | 32902872 32902771 | 0 | 20 | 32902872 32902771 | 0 | 20 | 32902872 32902771 | 33048159 | 20 | 20 (6 rows)
Philip Warner <pjw@rhyme.com.au> writes: > # set enable_indexscan=off; > # SELECT xmin, xmax, cmin, cmax FROM xxx where id = 24613; > xmin | xmax | cmin | cmax > ----------+----------+------+---------- > 32902771 | 0 | 20 | 32902872 > 32902771 | 0 | 20 | 32902872 > 32902771 | 0 | 20 | 32902872 > 32902771 | 0 | 20 | 32902872 > 32902771 | 0 | 20 | 32902872 > 32902771 | 33048159 | 20 | 20 > (6 rows) For completeness, could we also see ctid in that query? regards, tom lane
Philip Warner wrote: > # set enable_indexscan=off; > # SELECT xmin, xmax, cmin, cmax FROM xxx where id = 24613; > xmin | xmax | cmin | cmax > ----------+----------+------+---------- > 32902771 | 0 | 20 | 32902872 > 32902771 | 0 | 20 | 32902872 > 32902771 | 0 | 20 | 32902872 > 32902771 | 0 | 20 | 32902872 > 32902771 | 0 | 20 | 32902872 > 32902771 | 33048159 | 20 | 20 > (6 rows) Ugh. Do the triggers involved have EXCEPTION clauses? (I assume they are written in PL/pgSQL -- are there any in other languages?) -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Tom Lane wrote: > For completeness, could we also see ctid in that query? mail=# set enable_indexscan=off; mail=# SELECT xmin, xmax, cmin, cmax,ctid FROM xxx where id = 24613; xmin | xmax | cmin | cmax | ctid ----------+----------+------+----------+--------- 32902771 | 0 | 20 | 32902872 | (0,7) 32902771 | 0 | 20 | 32902872 | (2,27) 32902771 | 0 | 20 | 32902872 | (58,27) 32902771 | 0 | 20 | 32902872 | (60,28) 32902771 | 0 | 20 | 32902872 | (69,3) 32902771 | 33048159 | 20 | 20 | (72,27) (6 rows)
Alvaro Herrera wrote: > Do the triggers involved have EXCEPTION clauses? (I assume they are > written in PL/pgSQL -- are there any in other languages?) Triggers that update this table are in pl/pgsql, and can *raise* exceptions (using RAISE) if that is what you mean. They do not handle them -- is that even possible on pl/pgsql? Other triggers (for slony) are written in plpgsql and C, and I know some of the pl/pgsql triggers raise exceptions.
Philip Warner <pjw@rhyme.com.au> writes: > mail=# set enable_indexscan=off; > mail=# SELECT xmin, xmax, cmin, cmax,ctid FROM xxx where id = 24613; > xmin | xmax | cmin | cmax | ctid > ----------+----------+------+----------+--------- > 32902771 | 0 | 20 | 32902872 | (0,7) > 32902771 | 0 | 20 | 32902872 | (2,27) > 32902771 | 0 | 20 | 32902872 | (58,27) > 32902771 | 0 | 20 | 32902872 | (60,28) > 32902771 | 0 | 20 | 32902872 | (69,3) > 32902771 | 33048159 | 20 | 20 | (72,27) > (6 rows) The "cmax" values in the first 5 rows are evidently really xvac values, ie, these have all been moved by VACUUM FULL. (I assume you run VACUUM FULL regularly on this table?) The thing that is striking though is that the xmin/cmin values are all the same, indicating that all six tuples were inserted by the same command. That seems pretty odd. Can you show us the procedure by which rows are inserted in this table? Also, the last tuple has either been deleted or locked-for-update by transaction 33048159; if it were an attempted deletion we'd have to conclude that 33048159 failed to commit. Do you use SELECT FOR UPDATE on this table? BTW, which of these rows is selected by an indexscan-enabled query, ie, set enable_indexscan=on then repeat same query? regards, tom lane
Tom Lane wrote: > Philip Warner <pjw@rhyme.com.au> writes: > >> mail=# set enable_indexscan=off; >> mail=# SELECT xmin, xmax, cmin, cmax,ctid FROM xxx where id = 24613; >> xmin | xmax | cmin | cmax | ctid >> ----------+----------+------+----------+--------- >> 32902771 | 0 | 20 | 32902872 | (0,7) >> 32902771 | 0 | 20 | 32902872 | (2,27) >> 32902771 | 0 | 20 | 32902872 | (58,27) >> 32902771 | 0 | 20 | 32902872 | (60,28) >> 32902771 | 0 | 20 | 32902872 | (69,3) >> 32902771 | 33048159 | 20 | 20 | (72,27) >> (6 rows) >> > > The "cmax" values in the first 5 rows are evidently really xvac values, > ie, these have all been moved by VACUUM FULL. (I assume you run VACUUM > FULL regularly on this table?) Yes, every minute. Table has about 1500 rows and grows *very* fast due to updates. > The thing that is striking though is > that the xmin/cmin values are all the same, indicating that all six > tuples were inserted by the same command. That seems pretty odd. Can > you show us the procedure by which rows are inserted in this table? > The original insertion is probably not relevant (it happened months ago); there are many places that update the table. And for the specific row in question, it was probably inserted directly by psql. Other rows exhibit this problem (less often), were usually inserted by a long pgsql procedure. Updates happen regularly from many sources, but the procedure that does the most updates is a trigger. Do you want to see that? > Also, the last tuple has either been deleted or locked-for-update by > transaction 33048159; if it were an attempted deletion we'd have to > conclude that 33048159 failed to commit. Do you use SELECT FOR UPDATE > on this table? > No. But when a new row is added, I do lock the table in exclusive mode: Lock Table xxx In Exclusive Mode; The specific row in these examples will never be deleted. > BTW, which of these rows is selected by an indexscan-enabled query, > ie, set enable_indexscan=on then repeat same query? > xmin | xmax | cmin | cmax | ctid ----------+----------+------+------+--------- 32902771 | 33048159 | 20 | 20 | (72,27)
Philip Warner <pjw@rhyme.com.au> writes: > Tom Lane wrote: >> The thing that is striking though is >> that the xmin/cmin values are all the same, indicating that all six >> tuples were inserted by the same command. That seems pretty odd. Can >> you show us the procedure by which rows are inserted in this table? >> > The original insertion is probably not relevant (it happened months > ago); there are many places that update the table. OK, what I should say is that all these tuples were updated by the same command. > Updates happen regularly from many sources, but the procedure that does > the most updates is a trigger. Do you want to see that? Please. Also, if you care to run pg_filedump -i -F over the table, it'd be interesting to see the complete header info for each of these tuples. regards, tom lane
Tom Lane wrote: >> Updates happen regularly from many sources, but the procedure that does >> the most updates is a trigger. Do you want to see that? >> > > Please. > public | tg_update_qqq_date | "trigger" | | mail | plpgsql | Declare uid bigint; Begin uid = (select owner_id from yyy m where m.f1 = NEW.f1); if (uid <> 0 and not uid is null) then update xxx set qqq_date = 'now' where id=uid; end if; Return NEW; End; | and there's also a rewrite rule: zzz_update_r1 AS ON UPDATE TO zzz DO INSTEAD UPDATE xxx SET f1 = new.f1 WHERE xxx.id = new.id zzz_update_r2 AS ON UPDATE TO zzz DO INSTEAD UPDATE xxx SET f2 = new.f2 WHERE xxx.id = new.id > Also, if you care to run pg_filedump -i -F over the table, it'd be > interesting to see the complete header info for each of these tuples. > obviously from different blocks (do you need more details?): Item 7 -- Length: 168 Offset: 3920 (0x0f50) Flags: USED XMIN: 32902771 CMIN: 20 XMAX: 0 CMAX|XVAC: 32902872 Block Id: 0 linp Index: 7 Attributes: 34 Size: 36 infomask: 0x2913 (HASNULL|HASVARWIDTH|HASOID|XMIN_COMMITTED|XMAX_INVALID|UPDATED) t_bits: [0]: 0x9f [1]: 0x80 [2]: 0x7e [3]: 0x84 [4]: 0x00 0f50: 730ef601 14000000 00000000 d80ef601 s............... 0f60: 00000000 07002200 1329249f 807e8400 ......"..)$..~.. 0f70: d37e0000 25600000 00000000 09000000 .~..%`.......... 0f80: 00000000 00000000 00000000 00000000 ................ 0f90: 00000000 00000000 04000000 12bcf968 ...............h 0fa0: d28fa741 22000000 5f5f4021 696e7465 ...A"...__@!inte 0fb0: 726e616c 5f64656c 69766572 795f7573 rnal_delivery_us 0fc0: 65722140 5f5f0000 00000000 00000000 er!@__.......... 0fd0: 01000000 00000000 3c307819 0e1fa441 ........<0x....A 0fe0: 3c307819 0e1fa441 3c307819 0e1fa441 <0x....A<0x....A 0ff0: 00000000 00000000 ........ Item 27 -- Length: 168 Offset: 2700 (0x0a8c) Flags: USED XMIN: 32902771 CMIN: 20 XMAX: 0 CMAX|XVAC: 32902872 Block Id: 2 linp Index: 27 Attributes: 34 Size: 36 infomask: 0x2913 (HASNULL|HASVARWIDTH|HASOID|XMIN_COMMITTED|XMAX_INVALID|UPDATED) t_bits: [0]: 0x9f [1]: 0x80 [2]: 0x7e [3]: 0x84 [4]: 0x00 0a8c: 730ef601 14000000 00000000 d80ef601 s............... 0a9c: 00000200 1b002200 1329249f 807e8400 ......"..)$..~.. 0aac: d37e0000 25600000 00000000 09000000 .~..%`.......... 0abc: 00000000 00000000 00000000 00000000 ................ 0acc: 00000000 00000000 04000000 12bcf968 ...............h 0adc: d28fa741 22000000 5f5f4021 696e7465 ...A"...__@!inte 0aec: 726e616c 5f64656c 69766572 795f7573 rnal_delivery_us 0afc: 65722140 5f5f0000 00000000 00000000 er!@__.......... 0b0c: 01000000 00000000 3c307819 0e1fa441 ........<0x....A 0b1c: 3c307819 0e1fa441 3c307819 0e1fa441 <0x....A<0x....A 0b2c: 00000000 00000000 ........ Item 27 -- Length: 168 Offset: 7724 (0x1e2c) Flags: USED XMIN: 32902771 CMIN: 20 XMAX: 0 CMAX|XVAC: 32902872 Block Id: 58 linp Index: 27 Attributes: 34 Size: 36 infomask: 0x2913 (HASNULL|HASVARWIDTH|HASOID|XMIN_COMMITTED|XMAX_INVALID|UPDATED) t_bits: [0]: 0x9f [1]: 0x80 [2]: 0x7e [3]: 0x84 [4]: 0x00 1e2c: 730ef601 14000000 00000000 d80ef601 s............... 1e3c: 00003a00 1b002200 1329249f 807e8400 ..:..."..)$..~.. 1e4c: d37e0000 25600000 00000000 09000000 .~..%`.......... 1e5c: 00000000 00000000 00000000 00000000 ................ 1e6c: 00000000 00000000 04000000 12bcf968 ...............h 1e7c: d28fa741 22000000 5f5f4021 696e7465 ...A"...__@!inte 1e8c: 726e616c 5f64656c 69766572 795f7573 rnal_delivery_us 1e9c: 65722140 5f5f0000 00000000 00000000 er!@__.......... 1eac: 01000000 00000000 3c307819 0e1fa441 ........<0x....A 1ebc: 3c307819 0e1fa441 3c307819 0e1fa441 <0x....A<0x....A 1ecc: 00000000 00000000 ........ Item 28 -- Length: 168 Offset: 8024 (0x1f58) Flags: USED XMIN: 32902771 CMIN: 20 XMAX: 0 CMAX|XVAC: 32902872 Block Id: 60 linp Index: 28 Attributes: 34 Size: 36 infomask: 0x2913 (HASNULL|HASVARWIDTH|HASOID|XMIN_COMMITTED|XMAX_INVALID|UPDATED) t_bits: [0]: 0x9f [1]: 0x80 [2]: 0x7e [3]: 0x84 [4]: 0x00 1f58: 730ef601 14000000 00000000 d80ef601 s............... 1f68: 00003c00 1c002200 1329249f 807e8400 ..<..."..)$..~.. 1f78: d37e0000 25600000 00000000 09000000 .~..%`.......... 1f88: 00000000 00000000 00000000 00000000 ................ 1f98: 00000000 00000000 04000000 12bcf968 ...............h 1fa8: d28fa741 22000000 5f5f4021 696e7465 ...A"...__@!inte 1fb8: 726e616c 5f64656c 69766572 795f7573 rnal_delivery_us 1fc8: 65722140 5f5f0000 00000000 00000000 er!@__.......... 1fd8: 01000000 00000000 3c307819 0e1fa441 ........<0x....A 1fe8: 3c307819 0e1fa441 3c307819 0e1fa441 <0x....A<0x....A 1ff8: 00000000 00000000 ........ Item 3 -- Length: 168 Offset: 8024 (0x1f58) Flags: USED XMIN: 32902771 CMIN: 20 XMAX: 0 CMAX|XVAC: 32902872 Block Id: 69 linp Index: 3 Attributes: 34 Size: 36 infomask: 0x2913 (HASNULL|HASVARWIDTH|HASOID|XMIN_COMMITTED|XMAX_INVALID|UPDATED) t_bits: [0]: 0x9f [1]: 0x80 [2]: 0x7e [3]: 0x84 [4]: 0x00 1f58: 730ef601 14000000 00000000 d80ef601 s............... 1f68: 00004500 03002200 1329249f 807e8400 ..E..."..)$..~.. 1f78: d37e0000 25600000 00000000 09000000 .~..%`.......... 1f88: 00000000 00000000 00000000 00000000 ................ 1f98: 00000000 00000000 04000000 12bcf968 ...............h 1fa8: d28fa741 22000000 5f5f4021 696e7465 ...A"...__@!inte 1fb8: 726e616c 5f64656c 69766572 795f7573 rnal_delivery_us 1fc8: 65722140 5f5f0000 00000000 00000000 er!@__.......... 1fd8: 01000000 00000000 3c307819 0e1fa441 ........<0x....A 1fe8: 3c307819 0e1fa441 3c307819 0e1fa441 <0x....A<0x....A 1ff8: 00000000 00000000 ........ Item 27 -- Length: 168 Offset: 8024 (0x1f58) Flags: USED XMIN: 32902771 CMIN: 20 XMAX: 33048159 CMAX|XVAC: 20 Block Id: 318 linp Index: 6 Attributes: 34 Size: 36 infomask: 0x2913 (HASNULL|HASVARWIDTH|HASOID|XMIN_COMMITTED|XMAX_INVALID|UPDATED) t_bits: [0]: 0x9f [1]: 0x80 [2]: 0x7e [3]: 0x84 [4]: 0x00 1f58: 730ef601 14000000 5f46f801 14000000 s......._F...... 1f68: 00003e01 06002200 1329249f 807e8400 ..>..."..)$..~.. 1f78: d37e0000 25600000 00000000 09000000 .~..%`.......... 1f88: 00000000 00000000 00000000 00000000 ................ 1f98: 00000000 00000000 04000000 12bcf968 ...............h 1fa8: d28fa741 22000000 5f5f4021 696e7465 ...A"...__@!inte 1fb8: 726e616c 5f64656c 69766572 795f7573 rnal_delivery_us 1fc8: 65722140 5f5f0000 00000000 00000000 er!@__.......... 1fd8: 01000000 00000000 3c307819 0e1fa441 ........<0x....A 1fe8: 3c307819 0e1fa441 3c307819 0e1fa441 <0x....A<0x....A 1ff8: 00000000 00000000 ........
Philip Warner <pjw@rhyme.com.au> writes: > public | tg_update_qqq_date | "trigger" > | | mail | plpgsql | > Declare > uid bigint; > Begin > uid = (select owner_id from yyy m where m.f1 = NEW.f1); > if (uid <> 0 and not uid is null) then > update xxx set qqq_date = 'now' where id=uid; > end if; > Return NEW; > End; | > and there's also a rewrite rule: > zzz_update_r1 AS > ON UPDATE TO zzz DO INSTEAD UPDATE xxx SET f1 = new.f1 > WHERE xxx.id = new.id > zzz_update_r2 AS > ON UPDATE TO zzz DO INSTEAD UPDATE xxx SET f2 = new.f2 > WHERE xxx.id = new.id OK, I'm a bit confused by the obfuscation here. The table with the duplicates is xxx, or qqq? Which one is the trigger attached to? zzz is a view I suppose? regards, tom lane
Tom Lane wrote: > OK, I'm a bit confused by the obfuscation here. The table with the > duplicates is xxx, or qqq? Yes, xxx is the broken table. The two rewrite rules map updates on a view to an underlying table (the broken one). Updates on the view occur very frequently. Perhaps 400,000 per day? And, sadly, 200,000 of those on this one row. > Which one is the trigger attached to? > zzz is a view I suppose? > The trigger is attached to yet another table; when an insert or update occurs in this third table, a date field on xxx is updated. Updates/inserts on this happen less frequently -- perhaps low 10's of thousands per day. I had forgotten about the view/rewrite rules until I looked again at the db source (I thought they were triggers). These are the source of 90% of the updates.
Tom Lane wrote: > OK, I'm a bit confused by the obfuscation here. The table with the > duplicates is xxx, or qqq? Possibly less obscure version: public | tg_update_anotherTable_date | "trigger" | | mail | plpgsql | Declare uid bigint; Begin uid = (select owner_id from anotherNotherTable m where m.keyField = NEW.fkField); if (uid <> 0 and not uid is null) then update brokenTable set some_date = 'now' where id=uid; end if; Return NEW; End; | and there's also a rewrite rule: aView_update_r1 AS ON UPDATE TO aView DO INSTEAD UPDATE brokenTable SET f1 = new.f1 WHERE brokenTable.id = new.id aView_update_r2 AS ON UPDATE TO aView DO INSTEAD UPDATE brokenTable SET f2 = new.f2 WHERE brokenTable.id = new.id
Philip Warner <pjw@rhyme.com.au> writes: > aView_update_r1 AS > ON UPDATE TO aView DO INSTEAD UPDATE brokenTable SET f1 = new.f1 > WHERE brokenTable.id = new.id > aView_update_r2 AS > ON UPDATE TO aView DO INSTEAD UPDATE brokenTable SET f2 = new.f2 > WHERE brokenTable.id = new.id OK, that's a bit clearer. It'd help to see the view definition too. regards, tom lane
Tom Lane wrote: > Philip Warner <pjw@rhyme.com.au> writes: > >> aView_update_r1 AS >> ON UPDATE TO aView DO INSTEAD UPDATE brokenTable SET f1 = new.f1 >> WHERE brokenTable.id = new.id >> aView_update_r2 AS >> ON UPDATE TO aView DO INSTEAD UPDATE brokenTable SET f2 = new.f2 >> WHERE brokenTable.id = new.id >> > > OK, that's a bit clearer. It'd help to see the view definition too. > Nothing special, just some 'fake' fields and no join (mainly there for access control and compatibility across several applications): SELECT brokenTable.id, brokenTable.someField1 AS someAlias1, brokenTable."someField2"::character varying(64) AS someAlias2, 0::bigint AS someAlias3, brokenTable.someField4, 'crypt'::character varying AS someAlias5, brokenTable.fieldUpdatedByTrigger, brokenTable.fieldUpdatedByRule1 FROM brokenTable;
Oops. Minor change. Last two fields are updated by rules. Tom Lane wrote: > Philip Warner <pjw@rhyme.com.au> writes: > >> aView_update_r1 AS >> ON UPDATE TO aView DO INSTEAD UPDATE brokenTable SET f1 = new.f1 >> WHERE brokenTable.id = new.id >> aView_update_r2 AS >> ON UPDATE TO aView DO INSTEAD UPDATE brokenTable SET f2 = new.f2 >> WHERE brokenTable.id = new.id >> > > OK, that's a bit clearer. It'd help to see the view definition too. > Nothing special, just some 'fake' fields and no join (mainly there for access control and compatibility across several applications): SELECT brokenTable.id, brokenTable.someField1 AS someAlias1, brokenTable."someField2"::character varying(64) AS someAlias2, 0::bigint AS someAlias3, brokenTable.someField4, 'crypt'::character varying AS someAlias5, brokenTable.fieldUpdatedByRule1, brokenTable.fieldUpdatedByRule2 FROM brokenTable;
Philip Warner wrote: > Item 7 -- Length: 168 Offset: 3920 (0x0f50) Flags: USED > XMIN: 32902771 CMIN: 20 XMAX: 0 CMAX|XVAC: 32902872 > Block Id: 0 linp Index: 7 Attributes: 34 Size: 36 > infomask: 0x2913 > (HASNULL|HASVARWIDTH|HASOID|XMIN_COMMITTED|XMAX_INVALID|UPDATED) > t_bits: [0]: 0x9f [1]: 0x80 [2]: 0x7e [3]: 0x84 > [4]: 0x00 Hmm, shouldn't we see the MOVED_OFF bit set also if the cmax/xvac field is actually xvac? > Item 27 -- Length: 168 Offset: 8024 (0x1f58) Flags: USED > XMIN: 32902771 CMIN: 20 XMAX: 33048159 CMAX|XVAC: 20 > Block Id: 318 linp Index: 6 Attributes: 34 Size: 36 > infomask: 0x2913 > (HASNULL|HASVARWIDTH|HASOID|XMIN_COMMITTED|XMAX_INVALID|UPDATED) > t_bits: [0]: 0x9f [1]: 0x80 [2]: 0x7e [3]: 0x84 > [4]: 0x00 I'm confused -- the original report showed this tuple with ctid (72,27), but this seems to be in a different block? What's the explanation for this tuple to have cmin=cmax? Is this normal? Sorry I have only questions :-( -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes: > Philip Warner wrote: >> Item 7 -- Length: 168 Offset: 3920 (0x0f50) Flags: USED >> XMIN: 32902771 CMIN: 20 XMAX: 0 CMAX|XVAC: 32902872 >> Block Id: 0 linp Index: 7 Attributes: 34 Size: 36 >> infomask: 0x2913 >> (HASNULL|HASVARWIDTH|HASOID|XMIN_COMMITTED|XMAX_INVALID|UPDATED) >> t_bits: [0]: 0x9f [1]: 0x80 [2]: 0x7e [3]: 0x84 >> [4]: 0x00 > Hmm, shouldn't we see the MOVED_OFF bit set also if the cmax/xvac field > is actually xvac? These are probably MOVED_IN not MOVED_OFF tuples; VACUUM FULL clears their MOVED_IN flags at the end (see update_hint_bits()), which explains the lack of any flag being set. >> Item 27 -- Length: 168 Offset: 8024 (0x1f58) Flags: USED >> XMIN: 32902771 CMIN: 20 XMAX: 33048159 CMAX|XVAC: 20 >> Block Id: 318 linp Index: 6 Attributes: 34 Size: 36 >> infomask: 0x2913 >> (HASNULL|HASVARWIDTH|HASOID|XMIN_COMMITTED|XMAX_INVALID|UPDATED) >> t_bits: [0]: 0x9f [1]: 0x80 [2]: 0x7e [3]: 0x84 >> [4]: 0x00 > I'm confused -- the original report showed this tuple with ctid (72,27), > but this seems to be in a different block? The blockid/linp fields are pointing to the updated version of the row that xid 33048159 tried to create. It's a plausible theory that the reason 33048159 failed (as it evidently did, XMAX_INVALID) is that it failed on insertion of a duplicate key into the pkey index. It is interesting that the CMIN and CMAX are the same, but that probably just says that 32902771 and 33048159 were instances of the exact same SQL command pattern, and so their attempted updates both came 20 commands into the transaction. Philip suggested to me off-list that the initial error may have been the VACUUM FULL (xid 32902872) creating duplicate moved copies of a single valid row. That seems plausible because VACUUM FULL suppresses duplicate-index checks, and it's real hard to see any other way that a single transaction could have inserted all of these tuples without triggering the btree duplicate-key check (barring a completely corrupt index anyway). I wouldn't be surprised in the least bit to find another corner-case bug in the VACUUM FULL tuple chain moving code ... that bit of spaghetti has been too complex since day one ... regards, tom lane
Tom Lane wrote: > Philip suggested to me off-list that the initial error may have been the > VACUUM FULL (xid 32902872) creating duplicate moved copies of a single > valid row. That seems plausible because VACUUM FULL suppresses > duplicate-index checks, and it's real hard to see any other way that a > single transaction could have inserted all of these tuples without > triggering the btree duplicate-key check (barring a completely corrupt > index anyway). Another interesting factor -- these problems have not yet happened on any replicated DB. Slony replication works by using a trigger to store data changes in a log table; these changes are then applied on the destination DB. Slony also disables triggers on the destination. Because of the update load we also run the same vacuum regime. This seems to make it unlikely that vacuum is the sole culprit. Another difference is that the replicated DB is only ever updated. Virtually no load from applications reading data. I wonder if shared buffers in some way also interact here.