Re: Hmmm ... isn't count_nondeletable_pages all wet? - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Hmmm ... isn't count_nondeletable_pages all wet? |
Date | |
Msg-id | 29644.1189904237@sss.pgh.pa.us Whole thread Raw |
In response to | Hmmm ... isn't count_nondeletable_pages all wet? (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Hmmm ... isn't count_nondeletable_pages all wet?
Re: Hmmm ... isn't count_nondeletable_pages all wet? |
List | pgsql-hackers |
I wrote: > Is this analysis accurate, or am I missing something? If it is > accurate, do we need to postpone the upcoming releases to fix it? > I am thinking that some previously unexplained reports of index > corruption might now be explained ... Yeah, it's broken. Reproducing the race condition is a bit tricky, but if you have a debug-enabled build at hand it's simple. Here's a test case (turn off autovacuum if it's on, to prevent it from getting in there ahead of the manual vacuum): In session 1: regression=# create table foo (f1 int primary key); NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "foo_pkey" for table "foo" CREATE TABLE regression=# insert into foo select g from generate_series(1,10) g; INSERT 0 10 regression=# delete from foo; DELETE 10 Now start session 2, and attach to it with gdb, and set a breakpoint at lazy_truncate_heap(). Continue, and in session 2 do regression=# vacuum verbose foo; INFO: vacuuming "public.foo" INFO: scanned index "foo_pkey" to remove 10 row versions DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: "foo": removed 10 row versions in 1 pages DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: index "foo_pkey" now contains 0 row versions in 2 pages DETAIL: 10 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: "foo": found 10 removable, 0 nonremovable row versions in 1 pages DETAIL: 0 dead row versions cannot be removed yet. There were 0 unused item pointers. 1 pages contain useful free space. 0 pages are entirely empty. CPU 0.00s/0.00u sec elapsed 0.00 sec. [ hangs here at the breakpoint ] In session 1 do: regression=# begin; BEGIN regression=# insert into foo select g+100 from generate_series(1,10) g; INSERT 0 10 regression=# abort; ROLLBACK Now let gdb continue from the breakpoint, and notice session 2 thinks it can truncate away the whole table: INFO: "foo": truncated 1 to 0 pages DETAIL: CPU 0.00s/0.00u sec elapsed 0.01 sec. VACUUM regression=# Back to session 1, insert some conflicting data: regression=# insert into foo select g+1000 from generate_series(1,10) g; INSERT 0 10 and now we have a corrupt index with multiple pointers to these rows: regression=# select * from foo where f1 = 105; f1 ------1005 (1 row) regression=# select * from foo where f1 = 1005; f1 ------1005 (1 row) I am fairly sure that this bug explains problems previously reported by Merlin Moncure: http://archives.postgresql.org/pgsql-general/2006-10/msg01312.php and Florian Weimer: http://archives.postgresql.org/pgsql-general/2006-11/msg00305.php In both those cases, off-list investigation showed that the symptoms were caused by multiple index entries pointing to the same heap tuples, where one index entry matched the actual contents of the row and the other did not. In both cases this occurred for a fairly small number of rows that were clumped together into small ranges of blocks. It looks to me like this is perfectly explained by the theory that that range of blocks had been truncated away by a VACUUM at some point in the table's history, and that the non-matching index entries stemmed from an insert or update that occurred and then aborted after VACUUM had examined the blocks the first time but before it could return to check whether the blocks were still empty. It's a corner case, but I say it's a must-fix. Those bug reports have been bothering me for most of a year ... regards, tom lane
pgsql-hackers by date: