Re: Duplicate values found when reindexing unique index - Mailing list pgsql-bugs

From Tom Lane
Subject Re: Duplicate values found when reindexing unique index
Msg-id 7029.1199121834@sss.pgh.pa.us
In response to Re: Duplicate values found when reindexing unique index  ("Mason Hale" <masonhale@gmail.com>)
Responses Re: Duplicate values found when reindexing unique index  (Alvaro Herrera <alvherre@commandprompt.com>)
List pgsql-bugs
"Mason Hale" <masonhale@gmail.com> writes:
> Tom, I'll send these to you privately.

Thanks.  I don't see anything particularly surprising there though.
What I was wondering about was whether your application was in the
habit of doing repeated no-op updates on the same "entry" row.

The pg_filedump outputs seem to blow away any theory of hardware-level
duplication of the row --- all the tuples on both pages have the
expected block number in their headers, so it seems PG deliberately
put them where they are.  And the two tuples at issue are both marked
UPDATED, so they clearly are updated versions of some now-lost original.

What is not clear is whether they are independent updates of the same
original or whether there was a chain of updates --- that is, was the
newer one (which from the timestamp must be the one in the
lower-numbered block) made by an update from the older one, or from the
lost original?

Since the older one doesn't show any sign of having been updated itself
(in particular, no xmax and its ctid still points to itself), the former
theory would require assuming that the page update "got lost" --- was
discarded without being written to disk.  On the other hand, the latter
theory seems to require a similar assumption with respect to whatever
page held the original.
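
For anyone following along, the check described above can be sketched
roughly as follows. This is only an illustration of the reasoning, not
anything taken from the PostgreSQL sources: the HeapTupleInfo class, the
function names, and the block/offset values are all made up, and the
test is simplified (a real xmax can also be set by a row lock or by an
update that later aborted).

```python
from dataclasses import dataclass
from typing import Tuple

# (block number, line pointer offset), as a TID is usually written
Tid = Tuple[int, int]


@dataclass
class HeapTupleInfo:
    """Hypothetical, simplified view of the header fields one would
    read off a pg_filedump listing for a single heap tuple."""
    self_tid: Tid  # where the tuple physically sits
    xmax: int      # 0 means no updating/deleting transaction is recorded
    ctid: Tid      # t_ctid: equals self_tid unless a newer version exists


def shows_update(t: HeapTupleInfo) -> bool:
    """True if the tuple carries any sign of having been updated itself."""
    return t.xmax != 0 or t.ctid != t.self_tid


def is_successor(older: HeapTupleInfo, newer: HeapTupleInfo) -> bool:
    """True if the on-disk headers say 'newer' was made by updating 'older'."""
    return older.xmax != 0 and older.ctid == newer.self_tid


# Made-up values mirroring the situation described: the newer tuple sits
# in the lower-numbered block, and the older one still points to itself.
older = HeapTupleInfo(self_tid=(200, 3), xmax=0, ctid=(200, 3))
newer = HeapTupleInfo(self_tid=(100, 5), xmax=0, ctid=(100, 5))
```

With headers like these, shows_update(older) and is_successor(older,
newer) both come out false, so what is on disk cannot confirm a chain
from the older tuple to the newer one. A chain is only possible if the
write that would have recorded it never reached disk.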

Given this, and the index corruption you showed before (the wrong
sibling link, which would represent index breakage quite independent of
what was in the heap), and the curious contents of your WAL files
(likewise not explainable by anything going wrong within a table),
I'm starting to think that Occam's razor says you've got hardware
problems.  Or maybe a kernel-level bug that is causing writes to get
discarded.

            regards, tom lane
