Thread: Reproducible vacuum complaint!
I have devised a simple manual way of reproducing that peculiar VACUUM
notice that Oleg has been complaining about, but didn't have a reliable
way of triggering on demand. It looks like it is caused by some sort of
bug in the transaction commit logic --- or maybe just VACUUM's piece of
it, but anyway there is something mucho bad going on here.

Setup:

    create table hits (msg_id int, nhits int);
    create index hits_pkey on hits(msg_id);
    insert into hits values(42,0);
    insert into hits values(43,0);

Given this setup, you can do

    drop index hits_pkey;
    update hits set nhits = nhits+1 where msg_id = 42;
    create index hits_pkey on hits(msg_id);
    vacuum analyze hits;

all day with no problem.

BUT: start up another psql, and in that other psql begin a transaction
block and touch anything at all --- it doesn't have to be the table
under test:

    begin;
    select * from int4_tbl;

Now, *without committing* that other transaction, go back to the first
psql and try again:

    drop index hits_pkey;
    update hits set nhits = nhits+1 where msg_id = 42;
    create index hits_pkey on hits(msg_id);
    vacuum analyze hits;
    NOTICE:  Index hits_pkey: NUMBER OF INDEX' TUPLES (2) IS NOT THE SAME AS HEAP' (3).
            Try recreating the index.

You can repeat the vacuum (with or without analyze) as often as you
want, and you'll get the same notice each time. If you do more UPDATEs,
the reported number of heap tuples increases --- rather odd, considering
there are obviously only two committed tuples in the table (as can be
confirmed by a SELECT). As soon as you commit or abort the other
transaction, everything goes back to normal.

There are variants of this sequence that also cause the problem. The
critical factor seems to be that both the index itself and at least one
tuple in the table have to be younger than the oldest uncommitted
transaction.

At this point I decided that I was in over my head, so I'm tossing the
whole mess in Vadim's direction. I can't tell whether VACUUM itself is
confused or the transaction logic in general is, but it sure looks like
something is looking at the wrong xact to decide whether tuples have
been committed or not. This could be a symptom of a fairly serious
logic error down inside the tuple time qual checks...

        regards, tom lane
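For readers tracing the message in the sources: the NOTICE comes from
VACUUM's per-index sanity check, which compares the number of entries it
found while scanning the index against the number of live tuples it
counted while scanning the heap. A minimal sketch of that check,
paraphrased from memory of the 6.5-era vacuum code (the variable names
here are placeholders, not the actual source):

    /* Sketch only: nitups = index entries counted during the index scan,
     * nhtups = surviving heap tuples counted during the heap scan. */
    if (nitups != nhtups)
        elog(NOTICE,
             "Index %s: NUMBER OF INDEX' TUPLES (%u) IS NOT THE SAME AS HEAP' (%u).\n\tTry recreating the index.",
             indname, nitups, nhtups);

The repro above makes the two counts disagree without any real
corruption, which is the first hint that the two scans are judging
tuple visibility differently from each other.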
> -----Original Message-----
> From: Tom Lane
> Sent: Monday, November 22, 1999 8:00 AM
> To: pgsql-hackers@postgreSQL.org
> Subject: [HACKERS] Reproducible vacuum complaint!
>
> [...]
>
> Now, *without committing* that other transaction, go back to the first
> psql and try again:
>
> drop index hits_pkey;
> update hits set nhits = nhits+1 where msg_id = 42;
> create index hits_pkey on hits(msg_id);
> vacuum analyze hits;
> NOTICE:  Index hits_pkey: NUMBER OF INDEX' TUPLES (2) IS NOT THE
> SAME AS HEAP' (3).
>         Try recreating the index.

Hmm, if "select * ..." runs at the SERIALIZABLE isolation level, that
transaction would still see the old "msg_id = 42" tuple (not the new
one), so VACUUM doesn't remove the old "msg_id = 42" tuple: VACUUM
takes all running transactions into account. But AFAIK there's no other
code that does so. CREATE INDEX may be another place that should take
all running transactions into account.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp
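The removal rule Hiroshi is describing can be sketched like this. This
is a simplified illustration, not actual PostgreSQL source: the
TransactionId helpers named below exist in the backend, but the function
itself and its direct xid comparison are stand-ins for the real 6.5
vacuum logic.

    /* Sketch: VACUUM may reclaim a deleted tuple only if its deleting
     * transaction committed before the oldest transaction still running,
     * so that no open transaction's snapshot can still see the tuple. */
    bool
    tuple_is_reclaimable(TransactionId xmax, TransactionId oldest_running_xid)
    {
        if (!TransactionIdIsValid(xmax))
            return false;       /* tuple was never deleted */
        if (!TransactionIdDidCommit(xmax))
            return false;       /* deleter aborted or is still in progress */
        return xmax < oldest_running_xid;   /* pre-wraparound xids compare directly */
    }

CREATE INDEX, as the next message confirms, applies no such rule: it
indexes only what is visible right now.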
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes: > Hmm,if "select * .." runs in SERIALIZABLE isolation level,the transaction > would see an old "msg_id=42" tuple(not new one). So vacuum doesn't > vanish the old "msg_id=42" tuple. Vacuum takes all running transactions > into account. But AFAIK,there's no other such stuff. > CREATE INDEX may be another one which should take all running > transactions into account. Oh, I think I see --- you mean that CREATE INDEX needs to make index entries for tuples that are committed dead but might still be visible to some running transaction somewhere. Yes, that seems to fit what I was seeing. VACUUM always complained that there were too few index entries, never too many. It looks like btbuild() only indexes tuples that satisfy SnapshotNow, so this is definitely a potential problem for btree indexes. The other index types are likely broken in the same way... Comments anyone? What time qual should btbuild and friends be using, if not that? regards, tom lane
Tom Lane wrote:
>
> It looks like btbuild() only indexes tuples that satisfy SnapshotNow,
> so this is definitely a potential problem for btree indexes. The other
> index types are likely broken in the same way...
>
> Comments anyone? What time qual should btbuild and friends be using,
> if not that?

Seems that we need a new

    #define SnapshotAny ((Snapshot) 0x2)

and a new HeapTupleSatisfiesAny() returning TRUE for any tuple with a
valid and committed (or current xact id) t_xmin. -:(

Sorry, I missed the CREATE INDEX case.

Vadim

P.S. I'll comment about indices and vacuum later...
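Spelling Vadim's one-line description out as code gives something like
the sketch below. This is only an illustration of the proposal, not a
committed patch; the direct t_xmin field access follows the 6.5-era
tuple header layout, and the surrounding checks are inferred from his
wording.

    /* Sketch of the proposed time qual: accept any tuple whose inserting
     * transaction is valid and either our own or committed, whether or
     * not the tuple has since been deleted.  An index built under this
     * rule also covers tuples that are dead to us but still visible to
     * older open transactions, which is what CREATE INDEX needs. */
    bool
    HeapTupleSatisfiesAny(HeapTupleHeader tuple)
    {
        if (!TransactionIdIsValid(tuple->t_xmin))
            return false;   /* inserter never got a valid xid */
        if (TransactionIdIsCurrentTransactionId(tuple->t_xmin))
            return true;    /* our own uncommitted insert */
        return TransactionIdDidCommit(tuple->t_xmin);
    }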
When I vacuum the database (PostgreSQL 6.5.3 on SuSE 6.3 Linux, 2.2
kernel), I get the following error message:

    ERROR: HEAP_MOVED_IN was not expected.
    vacuumdb: database vacuum failed on ntis

This error only seems to occur after I have used the trim function to
clean up one of the columns in the msg table of a database called ntis:

    ntis=> update msg set description = trim(description);
    UPDATE 12069
    ntis=>

To try to track down the problem, I wrote a C program (using ecpg) that
trimmed the table one row at a time and vacuumed between each row
operation. I was hoping that this program would reveal a problem with
the data in one of my records. Unfortunately, the one-row-at-a-time
approach did not reveal the problem, and each vacuum completed without
error.

Can anyone tell me what a HEAP_MOVED_IN error is? I checked the source
but was not familiar enough with it to understand. Any ideas on why
trim() may have caused it?
Stephen Birch <sbirch@ironmountainsystems.com> writes:
> When I vacuum the database (PostgreSQL 6.5.3 on SuSE 6.3 Linux, 2.2
> kernel), I get the following error message:
>     ERROR: HEAP_MOVED_IN was not expected.
> Can anyone tell me what a HEAP_MOVED_IN error is? I checked the source
> but was not familiar enough with it to understand. Any ideas on why
> trim() may have caused it?

When VACUUM moves a tuple from one disk page to another (to compact the
table), the original tuple is marked HEAP_MOVED_OFF and the copy is
marked HEAP_MOVED_IN temporarily, until the VACUUM is ready to commit.
This is supposed to ensure that a failure partway through VACUUM won't
corrupt your table by leaving you with two copies of the same tuple.
(The HEAP_MOVED_OFF copy is valid until VACUUM commits, and the
HEAP_MOVED_IN copy is valid afterwards.)

I haven't heard other reports of this error message, so I suspect you
have found some hard-to-hit boundary-condition error in VACUUM's
data-shuffling logic. My guess is that the reason you don't see the
error after a single trim() is that not much data-shuffling is needed
to compact the table after just one tuple update.

What we need is a reproducible test case so we can chase down the bug
--- any chance you can provide one?

        regards, tom lane
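The two-flag scheme Tom describes reduces to a visibility rule along
these lines. A simplified sketch, not the actual tuple-time-qual code:
in the real backend the moving vacuum's xid is stored in the tuple
header itself, and 'moving_xid' below is hypothetical shorthand for it.

    /* Sketch: after a VACUUM move, exactly one of the two copies is
     * valid, decided by whether the moving vacuum's xid committed.
     * 'moving_xid' stands in for the vacuum xid kept with the tuple. */
    if (tuple->t_infomask & HEAP_MOVED_IN)
        visible = TransactionIdDidCommit(moving_xid);   /* new copy: valid after commit */
    else if (tuple->t_infomask & HEAP_MOVED_OFF)
        visible = !TransactionIdDidCommit(moving_xid);  /* old copy: valid until commit */

The "was not expected" ERROR fires when a code path that did not expect
to meet one of these transient flags encounters a tuple still carrying
one.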