Daniel Caldeweyher <dcalde@gmail.com> writes:
> I had a similar issue once and was able to recover from it. If this affects
> only some rows and you are able to identify them, this is fixable:
> --force table scan to skip using corrupt index
> set enable_seqscan=1
> set enable_indexscan=0
> set enable_bitmapscan=0
> select email,count(*)
> from users
> group by email
> having count(*) > 1;
> Then, if the rows are simply just duplicates and have no other changes, add
> a new serial column (or to prevent blocking, add a bigint column and update
> with sequential values), then using the emails from above, delete the ones
> with the higher/lower sequence number. Ensure you are still skipping
> indexes.
FWIW, that's kind of the hard way; you can just use the "ctid" system
column as a guaranteed-unique row identifier, instead of having to add
and remove a column. The latter is very expensive, and if there is any
actual data corruption in the heap it could make things worse.
select ctid, * from users where email = 'duplicated value';
-- choose which row(s) to nuke, then
delete from users where ctid = 'chosen value';
You definitely want to reindex after the data cleanup, since presumably
it's corruption of a unique index that got you into this mess in the
first place. But as long as it's only the index and not the table that's
damaged, recovery is pretty straightforward.
regards, tom lane