Home > mailing lists

Re: Duplicated records - Mailing list pgsql-sql

From	lucas@presserv.org
Subject	Re: Duplicated records
Date	May 25, 2005 13:58:22
Msg-id	20050525135807.x9p08co04v8gg0cw@www.presserv.org Whole thread Raw
In response to	Re: [despammed] Duplicated records (Andreas Kretschmer <akretschmer@despammed.com>)
Responses	Re: Duplicated records Re: [despammed] Re: Duplicated records
List	pgsql-sql

Tree view

Hi.
Thanks for the article...
But, I have read it and the query works very slow...
My table have aprox. 180.000 records (correct) and in entire table it has
aprox.360.000 records(duplicated)...
I tried to execute a query to delete the duplicated records, but it 
worked very
very slow... look:

# select * from lanctos order by numos; numos | field1 | field2 | field3 |... 00001 | test   | T2-2   | 2      |...
00001| test   | T2-2   | 2      |... 00002 | Blabla | 0      | ABC    |... 00002 | Blabla | 0      | ABC    |... 00003
|Llllll | Oooooo | Rrrrrr |... 00003 | Llllll | Oooooo | Rrrrrr |...
 
...

The records is entire duplicated (with all fields having the same data),
thinking the "numos" fields as primary key I have executed the query:

# DELETE from lanctos where not oid=(select oid from lanctos as l2 where
l2.numos=lanctos.numos limit 1);

I have tested others querys with EXPLAIN command to examine the performance
time, and this query was the best performance I got... but its is slow. Other
query is:

# DELETE from lanctos where not exists (select '1' from lanctos as l2 where
l2.numos=lanctos.numos and not l2.oid=lanctos.oid);

Is there a way to delete those duplicated records faster??? Remembering the
table have aprox 360.000 records...
Is better I create other table and copy those data??? How should I created???

Thanks.


Quoting Andreas Kretschmer <akretschmer@despammed.com>:
> am  24.05.2005, um 17:59:31 -0300 mailte lucas@presserv.org folgendes:
>> Hi.
>> How can I delete the duplicated records with "DELETE FROM TABLE WHERE..."
>> clause??
>
> Please read http://www.gtsm.com/oscon2003/deletetid.html
>
> Its a very good article about this problem.
>

pgsql-sql by date:

From: "hatuan"
Date: 25 May 2005, 12:27:58
Subject: Sql select like question.

From: Alvaro Herrera
Date: 25 May 2005, 14:45:02
Subject: Re: Duplicated records

Re: Duplicated records - Mailing list pgsql-sql

Previous

Next