Thread: Delete duplicates
Hi, I have a table with duplicates and trouble with my SQL. I'd like to keep a single record and remove older duplicates. For example below of the 6 recods I'd like to keep records 4 and 6. TABLE: aapid | keyword ----+----------------- 1 | LEAGUE PANTHERS 2 | LEAGUE PANTHERS 3 | LEAGUE PANTHERS 4 | LEAGUE PANTHERS 5 | LEAGUE BRONCOS6 | LEAGUE BRONCOS Here is my SQL so far, it will select records 1 to 5 instead of 1,2,3 and 5 only. Any help greatly appreciated. I think I need a Group By somewhere in there. select a1.id from aap a1 where id < ( SELECT max(id) FROM aap AS a2 ) AND EXISTS ( SELECT * FROM aap AS a2 WHERE a1.keyword = a2.keyword ) Regards Rudi.
On Sunday 22 June 2003 11:15, Rudi Starcevic wrote: > Hi, > > I have a table with duplicates and trouble with my SQL. (...) > select a1.id > from aap a1 > where id < ( SELECT max(id) FROM aap AS a2 ) > AND EXISTS > ( > SELECT * > FROM aap AS a2 > WHERE a1.keyword = a2.keyword > ) How about (untested): SELECT a1.id FROM aap a1WHERE id = (SELECT MAX(id) FROM aap a2 WHERE a2.keyword = a1.keyword) Ian Barwick barwick@gmx.net
On 22/06/2003 10:15 Rudi Starcevic wrote: > > > Hi, > > I have a table with duplicates and trouble with my SQL. > I'd like to keep a single record and remove older duplicates. > For example below of the 6 recods I'd like to keep records > 4 and 6. > > TABLE: aap > id | keyword > ----+----------------- > 1 | LEAGUE PANTHERS > 2 | LEAGUE PANTHERS > 3 | LEAGUE PANTHERS > 4 | LEAGUE PANTHERS > 5 | LEAGUE BRONCOS > 6 | LEAGUE BRONCOS > > Here is my SQL so far, it will select records 1 to 5 instead > of 1,2,3 and 5 only. > > Any help greatly appreciated. I think I need a Group By somewhere in > there. > > select a1.id > from aap a1 > where id < ( SELECT max(id) FROM aap AS a2 ) > AND EXISTS > ( > SELECT * > FROM aap AS a2 > WHERE a1.keyword = a2.keyword > ) I just tries this with 7.3.3: select max(id), keyword from aap where keyword in (select distinct keyword from aap) group by keyword; max | keyword ----------------------- 6 | LEAGUE BRONCOS 4 | LEAGUE PANTHERS (2 rows) HTH -- Paul Thomas +------------------------------+---------------------------------------------+ | Thomas Micro Systems Limited | Software Solutions for the Smaller Business | | Computer Consultants | http://www.thomas-micro-systems-ltd.co.uk | +------------------------------+---------------------------------------------+
Hi, you need find duplicates and then you remove them delete from aap where id not in ( select max(id) from aap b where aap.keyword = b.keyword ); Germán Sorry about my english -----Mensaje original----- De: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org] En nombre de Rudi Starcevic Enviado el: Domingo, 22 de Junio de 2003 5:15 Para: pgsql-sql@postgresql.org Asunto: [SQL] Delete duplicates Hi, I have a table with duplicates and trouble with my SQL. I'd like to keep a single record and remove older duplicates. For example below of the 6 recods I'd like to keep records 4 and 6. TABLE: aapid | keyword ----+----------------- 1 | LEAGUE PANTHERS 2 | LEAGUE PANTHERS 3 | LEAGUE PANTHERS 4 | LEAGUE PANTHERS 5 | LEAGUE BRONCOS6 | LEAGUE BRONCOS Here is my SQL so far, it will select records 1 to 5 instead of 1,2,3 and 5 only. Any help greatly appreciated. I think I need a Group By somewhere in there. select a1.id from aap a1 where id < ( SELECT max(id) FROM aap AS a2 ) AND EXISTS ( SELECT * FROM aap AS a2 WHERE a1.keyword = a2.keyword ) Regards Rudi. ---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
try this DELETE FROM aap WHERE id NOT IN ( SELECT max(id) FROM aap GROUP BY keyword ); > > > Hi, > > I have a table with duplicates and trouble with my SQL. > I'd like to keep a single record and remove older duplicates. > For example below of the 6 recods I'd like to keep records > 4 and 6. > > TABLE: aap > id | keyword > ----+----------------- > 1 | LEAGUE PANTHERS > 2 | LEAGUE PANTHERS > 3 | LEAGUE PANTHERS > 4 | LEAGUE PANTHERS > 5 | LEAGUE BRONCOS > 6 | LEAGUE BRONCOS > > Here is my SQL so far, it will select records 1 to 5 instead > of 1,2,3 and 5 only. > > Any help greatly appreciated. I think I need a Group By somewhere in > there. > > select a1.id > from aap a1 > where id < ( SELECT max(id) FROM aap AS a2 ) > AND EXISTS > ( > SELECT * > FROM aap AS a2 > WHERE a1.keyword = a2.keyword > ) > > Regards > Rudi. > > ---------------------------(end of > broadcast)--------------------------- TIP 1: subscribe and unsubscribe > commands go to majordomo@postgresql.org
How to delete "real" duplicates? id | somthing ----------------------- 1 | aaa 1 | aaa 2 | bbb 2 | bbb (an accident with backup recovery...) Regards, Denis Arh ----- Original Message ----- From: "Franco Bruno Borghesi" <franco@akyasociados.com.ar> To: <rudi@oasis.net.au> Cc: <pgsql-sql@postgresql.org> Sent: Sunday, June 22, 2003 11:17 PM Subject: Re: [SQL] Delete duplicates > try this > > DELETE FROM aap WHERE id NOT IN ( > SELECT max(id) > FROM aap > GROUP BY keyword > ); > > > > > > > Hi, > > > > I have a table with duplicates and trouble with my SQL. > > I'd like to keep a single record and remove older duplicates. > > For example below of the 6 recods I'd like to keep records > > 4 and 6. > > > > TABLE: aap > > id | keyword > > ----+----------------- > > 1 | LEAGUE PANTHERS > > 2 | LEAGUE PANTHERS > > 3 | LEAGUE PANTHERS > > 4 | LEAGUE PANTHERS > > 5 | LEAGUE BRONCOS > > 6 | LEAGUE BRONCOS > > > > Here is my SQL so far, it will select records 1 to 5 instead > > of 1,2,3 and 5 only. > > > > Any help greatly appreciated. I think I need a Group By somewhere in > > there. > > > > select a1.id > > from aap a1 > > where id < ( SELECT max(id) FROM aap AS a2 ) > > AND EXISTS > > ( > > SELECT * > > FROM aap AS a2 > > WHERE a1.keyword = a2.keyword > > ) > > > > Regards > > Rudi. > > > > ---------------------------(end of > > broadcast)--------------------------- TIP 1: subscribe and unsubscribe > > commands go to majordomo@postgresql.org > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 8: explain analyze is your friend > >
> How to delete "real" duplicates? > > id | somthing > ----------------------- > 1 | aaa > 1 | aaa > 2 | bbb > 2 | bbb > > (an accident with backup recovery...) I'm not 100% on some of the syntax off the top of my head, but: BEGIN; ALTER TABLE orig_table RENAME TO backup_table; CREATE TABLE orig_table AS SELECT id,something FROM backup_table GROUP BY id, something; -- Create any indexes on orig_table that need to be recreated DROP TABLE orig_table; COMMIT; This isn't for the faint of heart: be sure to do this inside of a transaction or on a backup db until you're 100% good to go. -sc -- Sean Chittenden
"Denis Arh" <denis@exonium.net> writes: > How to delete "real" duplicates? Use the OID or CTID system columns. regards, tom lane
Hi, Would this be OK or a little crude (untested) : INSERT INTO new_table ( id, something ) SELECT DISTINCT ON (id) id, something FROM old_table ORDER BY id Or something similar but create a new table ? Cheers Rudi. Denis Arh wrote: >How to delete "real" duplicates? > >id | somthing >----------------------- >1 | aaa >1 | aaa >2 | bbb >2 | bbb > >(an accident with backup recovery...) > > > >Regards, >Denis Arh > >
Denis Arh wrote: >How to delete "real" duplicates? > >id | somthing >----------------------- >1 | aaa >1 | aaa >2 | bbb >2 | bbb > >(an accident with backup recovery...) > In these cases, its certainly the best to rebuild your table using a CREATE TABLE new AS SELECT col1,col1.. FROM old GROUPY BY col1,col2... If you have got many duplicates, recreating a new table semms better to me than deleteing. (Another advantage is that you still have the "bad" table as a record). Cheers, Dani