SQL for removing duplicates? - Mailing list pgsql-novice

From
Subject	SQL for removing duplicates?
Date	June 13, 2006 13:12:15
Msg-id	200606131611.k5DGBff21012@panix3.panix.com
Responses	Re: SQL for removing duplicates?
List	pgsql-novice

Hi.  I'm stumped.  I have a large table (about 8.5M records), let's
call it t, whose columns include x and y.  I want to remove records
from this table so that any pair of values for these two fields appears
only once.  (This will get rid of about 15% of the records in t.)

One simple solution would be something like

  CREATE TABLE tmp AS SELECT DISTINCT ON ( x, y ) * FROM t;
  DROP TABLE t;
  ALTER TABLE tmp RENAME TO t;

This works, but it uses a lot of space.  I would prefer to simply cull
the unwanted records from t, but I just can't figure out the SQL for
it.  Any help with it would be *much* appreciated.

Thanks!

kj
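
One possible way to cull the duplicates in place, offered only as a sketch: it assumes PostgreSQL and reuses the names t, x and y from the post above.  It relies on the ctid system column (a row's physical location, unique within the table at any given moment) to pick one survivor per (x, y) pair.  Ordering comparisons on ctid are assumed to be available, which as far as I know requires PostgreSQL 8.3 or later; the post does not say which server version is in use.  Which duplicate survives is arbitrary here, just as it is with DISTINCT ON without an ORDER BY.

```sql
-- Sketch only, not tested against the poster's data.
-- t, x, y are the names used in the post; everything else is an assumption.
BEGIN;

-- Self-join t on (x, y) and delete every row for which another row
-- with the same pair sits at a smaller ctid.  The row with the
-- smallest ctid in each (x, y) group is the one that remains.
DELETE FROM t a
USING t b
WHERE a.x = b.x
  AND a.y = b.y
  AND a.ctid > b.ctid;

COMMIT;

-- Deleted rows keep occupying disk space until they are vacuumed.
VACUUM ANALYZE t;
```

Two caveats.  Rows where x or y is NULL never compare equal under =, so this statement leaves them alone, whereas DISTINCT ON treats NULLs as equal to each other.  And on a table of this size the self-join will probably want an index on (x, y) to finish in reasonable time; that is a guess about performance, not a measurement.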
