Re: Possible bug in vacuum redo - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Possible bug in vacuum redo
Date
Msg-id 24087.1009037614@sss.pgh.pa.us
In response to Re: Possible bug in vacuum redo  ("Hiroshi Inoue" <Inoue@tpf.co.jp>)
Responses Re: Possible bug in vacuum redo
List pgsql-hackers
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes:
> AFAIR t_ctid isn't logged in WAL.

After looking at the heap_update code I think you are right.  Doesn't
that render the field completely useless/unreliable?

In the simple heap_update case I think that heap_xlog_update could
easily set the old tuple's t_ctid field correctly.  Not sure how
it works when VACUUM is moving tuple chains around, however.
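
To make the simple heap_update case concrete, here is a toy model of what a redo routine could do. It is only a sketch under stated assumptions, not the real heap_xlog_update: the names ToyTuple, ToyUpdateRecord and toy_redo_update are invented for illustration, and the model assumes the update record carries both the old and the new tuple's TID.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Tid = Tuple[int, int]  # toy (block, offset) pair standing in for a TID


@dataclass
class ToyTuple:
    value: str
    t_ctid: Tid  # own TID if latest version, else TID of the next version


@dataclass
class ToyUpdateRecord:
    # Hypothetical WAL record: assumed to carry both TIDs.
    old_tid: Tid
    new_tid: Tid
    new_value: str


def toy_redo_update(heap: Dict[Tid, ToyTuple], rec: ToyUpdateRecord,
                    fix_old_ctid: bool) -> None:
    """Replay an update; optionally repair the old tuple's forward link."""
    heap[rec.new_tid] = ToyTuple(rec.new_value, rec.new_tid)
    if fix_old_ctid and rec.old_tid in heap:
        heap[rec.old_tid].t_ctid = rec.new_tid


def replay(fix_old_ctid: bool) -> Dict[Tid, ToyTuple]:
    # Page image from before the crash: the old tuple still points at itself.
    heap = {(0, 1): ToyTuple("v1", (0, 1))}
    toy_redo_update(heap, ToyUpdateRecord((0, 1), (0, 2), "v2"), fix_old_ctid)
    return heap
```

In this model, replay(False) leaves the old tuple claiming to be the latest version, while replay(True) leaves it pointing at the new tuple.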

Another thing I am currently looking at is that I do not believe VACUUM
handles tuple chain moves correctly.  It only enters the chain-moving
logic if it finds a tuple that is in the *middle* of an update chain,
i.e., both the prior and next tuples still exist.  In the case of a
two-element update chain (only the latest and next-to-latest tuples of
a row survive VACUUM), AFAICT vacuum will happily move the latest tuple
without ever updating the previous tuple's t_ctid.
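
The two-element case can be illustrated with a toy model. This is a sketch only, not VACUUM's actual code: ToyTuple, in_middle_of_chain, move_tuple and follow_chain are made-up names, and the "middle of chain" test simply encodes the condition described above (both a prior and a next tuple exist).

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Tid = Tuple[int, int]  # toy (block, offset) pair standing in for a TID


@dataclass
class ToyTuple:
    value: str
    t_ctid: Tid  # own TID if latest version, else TID of the next version


def in_middle_of_chain(heap: Dict[Tid, ToyTuple], tid: Tid) -> bool:
    """True only if the tuple has both a prior and a next version."""
    has_next = heap[tid].t_ctid != tid
    has_prior = any(t.t_ctid == tid for k, t in heap.items() if k != tid)
    return has_prior and has_next


def move_tuple(heap: Dict[Tid, ToyTuple], old_tid: Tid, new_tid: Tid,
               fix_prior: bool) -> None:
    """Relocate a tuple; optionally repoint tuples that linked to it."""
    tup = heap.pop(old_tid)
    if tup.t_ctid == old_tid:  # latest version keeps pointing at itself
        tup.t_ctid = new_tid
    if fix_prior:
        for other in heap.values():
            if other.t_ctid == old_tid:
                other.t_ctid = new_tid
    heap[new_tid] = tup


def follow_chain(heap: Dict[Tid, ToyTuple], tid: Tid) -> Optional[ToyTuple]:
    """Walk t_ctid links to the latest version; None if a link dangles."""
    seen = set()
    while tid in heap and tid not in seen:
        seen.add(tid)
        tup = heap[tid]
        if tup.t_ctid == tid:
            return tup
        tid = tup.t_ctid
    return None


def two_element_chain() -> Dict[Tid, ToyTuple]:
    return {(0, 1): ToyTuple("v1", (5, 3)), (5, 3): ToyTuple("v2", (5, 3))}


def demo(fix_prior: bool) -> Optional[ToyTuple]:
    heap = two_element_chain()
    move_tuple(heap, (5, 3), (0, 2), fix_prior)
    return follow_chain(heap, (0, 1))
```

Neither tuple of the two-element chain passes in_middle_of_chain, so the chain-aware path is never taken in this model. demo(False) then finds no latest version when starting from the older tuple, whereas demo(True) reaches "v2".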

In short, t_ctid seems extremely unreliable.  I have been trying to work
out a way that a bad t_ctid link could lead to the duplicate-tuple
reports we've been hearing lately, but so far I haven't seen one.  I do
think it can lead to missed UPDATEs in read-committed mode, however.
        regards, tom lane

