Thread: contracting tables
Hi-- I have a table with many records, some of which are duplicates (there is no unique constraints). How can I contract this table to remove any of these duplicate records? Like when using GROUP BY in a select statement, except that I want to just remove the extra entries from this table directly... Any ideas? Thanks, Peter Brown
On Fri, 2001-11-30 at 00:50, Peter T. Brown wrote: > I have a table with many records, some of which are duplicates (there is no > unique constraints). How can I contract this table to remove any of these > duplicate records? Like when using GROUP BY in a select statement, except > that I want to just remove the extra entries from this table directly... Any > ideas? If I understood you right, one problem is to identify only one row of a duplicate. There is a 'hidden' field oid in every table that is unique. It goes like this: select oid, * from <yourtable> So you could delete the rows by oid: delete from <yourtable> where oid = <oid> HTH Markus Bertheau
This is complicated (and I haven't tested it) but it might work: delete from <yourtable> where oid in (select (select oid from <yourtable> where <field>=yt.<field>) from <yourtable> yt group by <field>) Pretty crazy - don't know if it will work! Chris > -----Original Message----- > From: pgsql-sql-owner@postgresql.org > [mailto:pgsql-sql-owner@postgresql.org]On Behalf Of Markus Bertheau > Sent: Friday, 30 November 2001 3:53 PM > To: Peter T. Brown > Cc: postgres sql list > Subject: Re: [SQL] contracting tables > > > On Fri, 2001-11-30 at 00:50, Peter T. Brown wrote: > > I have a table with many records, some of which are duplicates > (there is no > > unique constraints). How can I contract this table to remove > any of these > > duplicate records? Like when using GROUP BY in a select > statement, except > > that I want to just remove the extra entries from this table > directly... Any > > ideas? > > If I understood you right, one problem is to identify only one row of a > duplicate. There is a 'hidden' field oid in every table that is unique. > It goes like this: > > select oid, * from <yourtable> > > So you could delete the rows by oid: > > delete from <yourtable> where oid = <oid> > > HTH > > Markus Bertheau > > >
Dear all, Any available solution to port CONNECT BY of oracle to postgresql ? M.T. __________________________________________________ Do You Yahoo!? Yahoo! GeoCities - quick and easy web site hosting, just $8.95/month. http://geocities.yahoo.com/ps/info1
If you don't have a lot of indexes, sequences, referential integrity constraints etc. the easiest way would be: SELECT INTO table2 DISTINCT * FROM table1; DROP table1; ALTER TABLE table2 RENAME TO table1; Then recreate your other objects/constraints. If you want to do it in place, then: DELETE FROM table1 WHERE EXISTS ( SELECT * FROM table1 AS t1 WHERE t1.key < table1.key ); You will need an index on your "key" value, or this will take a long time on a large table. --- "Peter T. Brown" <peter@memeticsystems.com> wrote: > Hi-- > > I have a table with many records, some of which are > duplicates (there is no > unique constraints). How can I contract this table > to remove any of these > duplicate records? Like when using GROUP BY in a > select statement, except > that I want to just remove the extra entries from > this table directly... Any > ideas? > > > Thanks, > > Peter Brown > > > ---------------------------(end of > broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please > send an appropriate > subscribe-nomail command to majordomo@postgresql.org > so that your > message can get through to the mailing list cleanly __________________________________________________ Do You Yahoo!? Yahoo! GeoCities - quick and easy web site hosting, just $8.95/month. http://geocities.yahoo.com/ps/info1
Thinking about that some more: SELECT INTO table2 DISTINCT * FROM table1; TRUNCATE table1; INSERT INTO table1 SELECT * FROM table2; DROP TABLE2; may be easiest, if you want to preserve table1, and need distinct row values. --- "Peter T. Brown" <peter@memeticsystems.com> wrote: > Hi-- > > I have a table with many records, some of which are > duplicates (there is no > unique constraints). How can I contract this table > to remove any of these > duplicate records? Like when using GROUP BY in a > select statement, except > that I want to just remove the extra entries from > this table directly... Any > ideas? > > > Thanks, > > Peter Brown > > > ---------------------------(end of > broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please > send an appropriate > subscribe-nomail command to majordomo@postgresql.org > so that your > message can get through to the mailing list cleanly __________________________________________________ Do You Yahoo!? Yahoo! GeoCities - quick and easy web site hosting, just $8.95/month. http://geocities.yahoo.com/ps/info1
On Fri, Nov 30, 2001 at 02:20:51AM -0800, MindTerm wrote: > Dear all, > > Any available solution to port CONNECT BY of oracle > to postgresql ? Several. There are some in the PostgreSQL Cookbook (http://www.brasileiro.net/postgres) and Dan Wickstrom (from OpenACS.org) came up with one for OpenACS: http://openacs.org/new-file-storage/one-file?file_id=123 -Roberto -- +----| http://fslc.usu.edu USU Free Software & GNU/Linux Club |------+ Roberto Mello - Computer Science, USU - http://www.brasileiro.net http://www.sdl.usu.edu - Space Dynamics Lab, Developer I'm Not Schizophrenic, And Neither Am I.
Thanks for the reply everyone! I cannot seem to find the EXISTS keyword anywhere in the postgresql documentation.. What does that do? And, generally, do you guys think it better/more efficient/etc to select into, rename, drop tables OR to do the complex single SQL to operate on the table directly? Thanks -----Original Message----- From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org]On Behalf Of Jeff Eckermann Sent: Friday, November 30, 2001 7:39 AM To: Peter T. Brown; pgsql-sql@postgresql.org Subject: Re: [SQL] contracting tables If you don't have a lot of indexes, sequences, referential integrity constraints etc. the easiest way would be: SELECT INTO table2 DISTINCT * FROM table1; DROP table1; ALTER TABLE table2 RENAME TO table1; Then recreate your other objects/constraints. If you want to do it in place, then: DELETE FROM table1 WHERE EXISTS ( SELECT * FROM table1 AS t1 WHERE t1.key < table1.key ); You will need an index on your "key" value, or this will take a long time on a large table. --- "Peter T. Brown" <peter@memeticsystems.com> wrote: > Hi-- > > I have a table with many records, some of which are > duplicates (there is no > unique constraints). How can I contract this table > to remove any of these > duplicate records? Like when using GROUP BY in a > select statement, except > that I want to just remove the extra entries from > this table directly... Any > ideas? > > > Thanks, > > Peter Brown > > > ---------------------------(end of > broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please > send an appropriate > subscribe-nomail command to majordomo@postgresql.org > so that your > message can get through to the mailing list cleanly __________________________________________________ Do You Yahoo!? Yahoo! GeoCities - quick and easy web site hosting, just $8.95/month. http://geocities.yahoo.com/ps/info1 ---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
"Peter T. Brown" <peter@memeticsystems.com> writes: > I cannot seem to find the EXISTS keyword anywhere in the postgresql > documentation. Er ... um ... well ... there isn't any doco about EXISTS ... nor related constructs such as IN. I've attempted to rectify this in the 7.2-to-be docs, see http://candle.pha.pa.us/main/writings/pgsql/sgml/functions-subquery.html It's a tad on the dry-and-technical side, for sure. You might prefer to consult an introductory SQL book or three. regards, tom lane
On Sat, Dec 01, 2001 at 12:21:16AM -0500, Tom Lane wrote: > > Er ... um ... well ... there isn't any doco about EXISTS ... nor > related constructs such as IN. Is there a documentation TODO somewhere? I'm sure people would step up to write/rectify more docs if they knew what needs to be written (myself included). -Roberto -- +----| http://fslc.usu.edu USU Free Software & GNU/Linux Club |------+ Roberto Mello - Computer Science, USU - http://www.brasileiro.net http://www.sdl.usu.edu - Space Dynamics Lab, Developer Je n'est suis pas tr�s content avec cette classe de Calculus.
>>>>> "Jeff" == Jeff Eckermann <jeff_eckermann@yahoo.com> writes: Jeff> If you don't have a lot of indexes, sequences, Jeff> referential integrity constraints etc. the easiest way Jeff> would be: Jeff> SELECT INTO table2 DISTINCT * FROM table1; Jeff> DROP table1; Jeff> ALTER TABLE table2 RENAME TO table1; Jeff> Then recreate your other objects/constraints. Jeff> If you want to do it in place, then: Jeff> DELETE FROM table1 Jeff> WHERE EXISTS ( Jeff> SELECT * FROM table1 AS t1 Jeff> WHERE t1.key < table1.key Jeff> ); Jeff> You will need an index on your "key" value, or this Jeff> will take a long time on a large table. Or maybe something like: DELETE FROM table1 WHERE oid NOT IN (SELECT min(oid) FROM table1 GROUP BY key1, key2) Untested, and I might be a little fuzzy on the syntax. This keeps the lowest oid row for the given key1/key2 pair. Change that to * to remove duplicates across all columns. -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 <merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/> Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!