Home > mailing lists

Re: Duplicate data despite unique constraint - Mailing list pgsql-general

From	Tom Lane
Subject	Re: Duplicate data despite unique constraint
Date	September 11, 2016 17:32:10
Msg-id	23007.1473604324@sss.pgh.pa.us Whole thread Raw
In response to	Re: Duplicate data despite unique constraint (Daniel Caldeweyher <dcalde@gmail.com>)
List	pgsql-general

Tree view

Daniel Caldeweyher <dcalde@gmail.com> writes:
> I had a similar issue once and was able to recover from it. If this affects
> only some rows and you are able to identify them, this is fixable:

> --force table scan to skip using corrupt index
> set enable_seqscan=1
> set enable_indexscan=0
> set enable_bitmapscan=0
> select email,count(*)
> from users
> group by email
> having count(*) > 1;

> Then, if the rows are simply just duplicates and have no other changes, add
> a new serial column (or to prevent blocking, add a bigint column and update
> with sequential values), then using the emails from above, delete the ones
> with the higher/lower sequence number. Ensure you are still skipping
> indexes.

FWIW, that's kind of the hard way; you can just use the "ctid" system
column as a guaranteed-unique row identifier, instead of having to add
and remove a column.  The latter is very expensive, and if there is any
actual data corruption in the heap it could make things worse.

select ctid, * from users where email = 'duplicated value';
-- choose which row(s) to nuke, then
delete from users where ctid = 'chosen value';

You definitely want to reindex after the data cleanup, since presumably
it's corruption of a unique index that got you into this mess in the
first place.  But as long as it's only the index and not the table that's
damaged, recovery is pretty straightforward.

            regards, tom lane

pgsql-general by date:

From: Christian Convey
Date: 11 September 2016, 16:47:23
Subject: Re: C++ port of Postgres

From: Patrick B
Date: 12 September 2016, 02:17:57
Subject: large number dead tup - Postgres 9.5

Re: Duplicate data despite unique constraint - Mailing list pgsql-general

Previous

Next