Thread: archives is not working was (Re: [NOVICE] Removing duplicate keys)

archives is not working was (Re: [NOVICE] Removing duplicate keys)

From
Richard Broersma Jr
Date:
> I have been trying to open the link and seems that
> postgresql archive site is down.
> http://archives.postgresql.org/pgsql-novice/2006-06/msg00092.php

Sorry, the link didn't work.  It worked yesterday when I found it.

I will also forward this to the general list maybe someone there could take a look at it.

Regards,

Richard Broersma Jr.


Re: archives is not working was (Re: [NOVICE] Removing duplicate keys)

From
Ray Stell
Date:
fixed now:

    * From: <kynn ( at ) panix ( dot ) com>
    * To: pgsql-novice ( at ) postgresql ( dot ) org
    * Subject: SQL for removing duplicates?
    * Date: Tue, 13 Jun 2006 12:11:41 -0400 (EDT)


Hi.  I'm stumped.  I have a large table (about 8.5M records), let's
call it t, whose columns include x and y.  I want to remove records
from this table so that any pair of values for these two fields appear
only once.  (This will get rid of about 15% of the records in t.)

One simple solution would be something like

  CREATE TABLE tmp AS SELECT DISTINCT ON ( x, y ) * FROM t;
  DROP TABLE t;
  ALTER TABLE tmp RENAME TO t;

This works, but it uses a lot of space.  I would prefer to simply cull
the unwanted records from t, but I just can't figure out the SQL for
it.  Any help with it would be *much* appreciated.

Thanks!




On Wed, Sep 06, 2006 at 08:00:54AM -0700, Richard Broersma Jr wrote:
> > I have been trying to open the link and seems that
> > postgresql archive site is down.
> > http://archives.postgresql.org/pgsql-novice/2006-06/msg00092.php
>
> Sorry, the link didn't work.  It worked yesterday when I found it.
>
> I will also forward this to the general list maybe someone there could take a look at it.
>
> Regards,
>
> Richard Broersma Jr.
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match

--