Thread: could not create unique index, table contains duplicated values
Hello, I have a table with two columns (id, names). At this time there is no index, because i have not needed any one. There are about 120 million rows. Now I want to delete all duplicate rows. I thought I could just create a unique index and postgresql qould do the rest. But ok. Maybe postgresql didn't know which rows to delete. Is there a query that will delete all rows where the same name is more than one time in the table? I only want to have every name once in the table. My method to write a programm, select every row and cvompare it to the others would be very slow. There is something with a having statement, but I don't get the clue to build that statement. Can you help me? Thank you very much, Christian
On Sun, 2005-11-27 at 22:45 +0100, Christian Hofmann wrote: > Hello, > > I have a table with two columns (id, names). > > At this time there is no index, because i have not needed any one. > > There are about 120 million rows. Now I want to delete all duplicate rows. > I thought I could just create a unique index and postgresql qould do the > rest. But ok. Maybe postgresql didn't know which rows to delete. > > Is there a query that will delete all rows where the same name is more than > one time in the table? > I only want to have every name once in the table. > > My method to write a programm, select every row and cvompare it to the > others would be very slow. Do something like this: SELECT DISTINCT * INTO new_table FROM old_table; DROP TABLE old_table; ALTER TABLE new_table RENAME TO old_table; -- Oliver Elphick olly@lfix.co.uk Isle of Wight http://www.lfix.co.uk/oliver GPG: 1024D/A54310EA 92C8 39E7 280E 3631 3F0E 1EC0 5664 7A2F A543 10EA ======================================== Do you want to know God? http://www.lfix.co.uk/knowing_god.html