Scott Marlowe <scott.marlowe@gmail.com> writes:
> If you're de-duping a whole table, no need to create indexes, as it's
> gonna have to hit every row anyway. Fastest way I've found has been:
> select a,b,c into newtable from oldtable group by a,b,c;
> One pass, done.
> If you want to use less than the whole row, you can use select
> distinct on (col1, col2) * into newtable from oldtable;
Also, the DISTINCT ON method can be refined to control which of a set of
duplicate keys is retained, if you can identify additional columns that
constitute a preference order for retaining/discarding dupes. See the
"latest weather reports" example in the SELECT reference page.
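
A rough sketch of that refinement, in case it helps. The table names follow the quoted example; the updated_at column is purely hypothetical and stands in for whatever column(s) express your preference. The leading ORDER BY columns have to match the DISTINCT ON list, and the remaining ones decide which row of each duplicate group comes first and is therefore kept:

```sql
-- Keep one row per (col1, col2), preferring the most recently updated.
-- "updated_at" is an assumed column, not something from the original post.
CREATE TABLE newtable AS
SELECT DISTINCT ON (col1, col2) *
FROM oldtable
ORDER BY col1, col2,        -- must lead with the DISTINCT ON expressions
         updated_at DESC;   -- preference order: newest row wins
```

CREATE TABLE AS is used here instead of SELECT ... INTO; the two are functionally similar for this purpose, so either form should do.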
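In any case, it's advisable to crank up work_mem while performing this
operation, since the sort or hash step that finds the duplicates will
spill to disk once it outgrows that setting.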
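
One way to do that without touching the server-wide setting is to raise it only for the transaction doing the de-dup. This is just a sketch: the 256MB figure is an arbitrary illustration, not a recommendation, and should be sized to the memory you can actually spare.

```sql
BEGIN;
-- SET LOCAL lasts only until the end of this transaction.
-- '256MB' is an illustrative value; pick one that fits your machine.
SET LOCAL work_mem = '256MB';

CREATE TABLE newtable AS
SELECT a, b, c
FROM oldtable
GROUP BY a, b, c;
COMMIT;
```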
regards, tom lane