Thread: contracting tables

contracting tables

From

"Peter T. Brown"

Date:

29 November 2001, 19:04:28

Hi--

I have a table with many records, some of which are duplicates (there is no
unique constraints). How can I contract this table to remove any of these
duplicate records? Like when using GROUP BY in a select statement, except
that I want to just remove the extra entries from this table directly... Any
ideas?


Thanks,

Peter Brown

Re: contracting tables

From

Markus Bertheau

Date:

30 November 2001, 03:04:42

On Fri, 2001-11-30 at 00:50, Peter T. Brown wrote:
> I have a table with many records, some of which are duplicates (there is no
> unique constraints). How can I contract this table to remove any of these
> duplicate records? Like when using GROUP BY in a select statement, except
> that I want to just remove the extra entries from this table directly... Any
> ideas?

If I understood you right, one problem is to identify only one row of a
duplicate. There is a 'hidden' field oid in every table that is unique.
It goes like this:

select oid, * from <yourtable>

So you could delete the rows by oid:

delete from <yourtable> where oid = <oid>

HTH

Markus Bertheau

Re: contracting tables

From

"Christopher Kings-Lynne"

Date:

30 November 2001, 04:04:43

This is complicated (and I haven't tested it) but it might work:

delete from <yourtable> where oid in (select (select oid from <yourtable>
where <field>=yt.<field>) from <yourtable> yt group by <field>)

Pretty crazy - don't know if it will work!

Chris

> -----Original Message-----
> From: pgsql-sql-owner@postgresql.org
> [mailto:pgsql-sql-owner@postgresql.org]On Behalf Of Markus Bertheau
> Sent: Friday, 30 November 2001 3:53 PM
> To: Peter T. Brown
> Cc: postgres sql list
> Subject: Re: [SQL] contracting tables
>
>
> On Fri, 2001-11-30 at 00:50, Peter T. Brown wrote:
> > I have a table with many records, some of which are duplicates
> (there is no
> > unique constraints). How can I contract this table to remove
> any of these
> > duplicate records? Like when using GROUP BY in a select
> statement, except
> > that I want to just remove the extra entries from this table
> directly... Any
> > ideas?
>
> If I understood you right, one problem is to identify only one row of a
> duplicate. There is a 'hidden' field oid in every table that is unique.
> It goes like this:
>
> select oid, * from <yourtable>
>
> So you could delete the rows by oid:
>
> delete from <yourtable> where oid = <oid>
>
> HTH
>
> Markus Bertheau
>
>
>

Any available solution to port CONNECT BY of oracle to postgresql

From

MindTerm

Date:

30 November 2001, 05:25:26

Dear all,
 Any available solution to port CONNECT BY of oracle
to postgresql ?

M.T.


__________________________________________________
Do You Yahoo!?
Yahoo! GeoCities - quick and easy web site hosting, just $8.95/month.
http://geocities.yahoo.com/ps/info1

Re: contracting tables

From

Jeff Eckermann

Date:

30 November 2001, 10:55:30

If you don't have a lot of indexes, sequences,
referential integrity constraints etc. the easiest way
would be:
SELECT INTO table2 DISTINCT * FROM table1;
DROP table1;
ALTER TABLE table2 RENAME TO table1;
Then recreate your other objects/constraints.

If you want to do it in place, then:
DELETE FROM table1 
WHERE EXISTS (    SELECT * FROM table1 AS t1     WHERE t1.key < table1.key
);
You will need an index on your "key" value, or this
will take a long time on a large table.
--- "Peter T. Brown" <peter@memeticsystems.com> wrote:
> Hi--
> 
> I have a table with many records, some of which are
> duplicates (there is no
> unique constraints). How can I contract this table
> to remove any of these
> duplicate records? Like when using GROUP BY in a
> select statement, except
> that I want to just remove the extra entries from
> this table directly... Any
> ideas?
> 
> 
> Thanks,
> 
> Peter Brown
> 
> 
> ---------------------------(end of
> broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please
> send an appropriate
> subscribe-nomail command to majordomo@postgresql.org
> so that your
> message can get through to the mailing list cleanly

__________________________________________________
Do You Yahoo!?
Yahoo! GeoCities - quick and easy web site hosting, just $8.95/month.
http://geocities.yahoo.com/ps/info1

Re: contracting tables

From

Jeff Eckermann

Date:

30 November 2001, 10:55:31

Thinking about that some more:

SELECT INTO table2 DISTINCT * FROM table1;
TRUNCATE table1;
INSERT INTO table1 SELECT * FROM table2;
DROP TABLE2;

may be easiest, if you want to preserve table1, and
need distinct row values.

--- "Peter T. Brown" <peter@memeticsystems.com> wrote:
> Hi--
> 
> I have a table with many records, some of which are
> duplicates (there is no
> unique constraints). How can I contract this table
> to remove any of these
> duplicate records? Like when using GROUP BY in a
> select statement, except
> that I want to just remove the extra entries from
> this table directly... Any
> ideas?
> 
> 
> Thanks,
> 
> Peter Brown
> 
> 
> ---------------------------(end of
> broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please
> send an appropriate
> subscribe-nomail command to majordomo@postgresql.org
> so that your
> message can get through to the mailing list cleanly


__________________________________________________
Do You Yahoo!?
Yahoo! GeoCities - quick and easy web site hosting, just $8.95/month.
http://geocities.yahoo.com/ps/info1

Re: Any available solution to port CONNECT BY of oracle to postgresql

From

Roberto Mello

Date:

30 November 2001, 12:34:39

On Fri, Nov 30, 2001 at 02:20:51AM -0800, MindTerm wrote:
> Dear all,
> 
>   Any available solution to port CONNECT BY of oracle
> to postgresql ?

Several.

There are some in the PostgreSQL Cookbook
(http://www.brasileiro.net/postgres) and Dan Wickstrom (from OpenACS.org)
came up with one for OpenACS:

http://openacs.org/new-file-storage/one-file?file_id=123

-Roberto

-- 
+----| http://fslc.usu.edu USU Free Software & GNU/Linux Club |------+ Roberto Mello - Computer Science, USU -
http://www.brasileiro.net      http://www.sdl.usu.edu - Space Dynamics Lab, Developer    
 
I'm Not Schizophrenic, And Neither Am I.

Re: contracting tables

From

"Peter T. Brown"

Date:

30 November 2001, 13:03:25

Thanks for the reply everyone!

I cannot seem to find the EXISTS keyword anywhere in the postgresql
documentation.. What does that do?

And, generally, do you guys think it better/more efficient/etc to select
into, rename, drop tables OR to do the complex single SQL to operate on the
table directly?

Thanks

-----Original Message-----
From: pgsql-sql-owner@postgresql.org
[mailto:pgsql-sql-owner@postgresql.org]On Behalf Of Jeff Eckermann
Sent: Friday, November 30, 2001 7:39 AM
To: Peter T. Brown; pgsql-sql@postgresql.org
Subject: Re: [SQL] contracting tables

If you don't have a lot of indexes, sequences,
referential integrity constraints etc. the easiest way
would be:
SELECT INTO table2 DISTINCT * FROM table1;
DROP table1;
ALTER TABLE table2 RENAME TO table1;
Then recreate your other objects/constraints.

If you want to do it in place, then:
DELETE FROM table1
WHERE EXISTS (    SELECT * FROM table1 AS t1    WHERE t1.key < table1.key
);
You will need an index on your "key" value, or this
will take a long time on a large table.
--- "Peter T. Brown" <peter@memeticsystems.com> wrote:
> Hi--
>
> I have a table with many records, some of which are
> duplicates (there is no
> unique constraints). How can I contract this table
> to remove any of these
> duplicate records? Like when using GROUP BY in a
> select statement, except
> that I want to just remove the extra entries from
> this table directly... Any
> ideas?
>
>
> Thanks,
>
> Peter Brown
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please
> send an appropriate
> subscribe-nomail command to majordomo@postgresql.org
> so that your
> message can get through to the mailing list cleanly

__________________________________________________
Do You Yahoo!?
Yahoo! GeoCities - quick and easy web site hosting, just $8.95/month.
http://geocities.yahoo.com/ps/info1

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html

Re: contracting tables

From

Tom Lane

Date:

01 December 2001, 00:25:37

"Peter T. Brown" <peter@memeticsystems.com> writes:
> I cannot seem to find the EXISTS keyword anywhere in the postgresql
> documentation.

Er ... um ... well ... there isn't any doco about EXISTS ... nor
related constructs such as IN.

I've attempted to rectify this in the 7.2-to-be docs, see
http://candle.pha.pa.us/main/writings/pgsql/sgml/functions-subquery.html
It's a tad on the dry-and-technical side, for sure.  You might prefer
to consult an introductory SQL book or three.
        regards, tom lane

Re: contracting tables

From

Roberto Mello

Date:

02 December 2001, 17:35:21

On Sat, Dec 01, 2001 at 12:21:16AM -0500, Tom Lane wrote:
> 
> Er ... um ... well ... there isn't any doco about EXISTS ... nor
> related constructs such as IN.

Is there a documentation TODO somewhere? I'm sure people would step up to
write/rectify more docs if they knew what needs to be written (myself
included).

-Roberto
-- 
+----| http://fslc.usu.edu USU Free Software & GNU/Linux Club |------+ Roberto Mello - Computer Science, USU -
http://www.brasileiro.net      http://www.sdl.usu.edu - Space Dynamics Lab, Developer    

Je n'est suis pas tr�s content avec cette classe de Calculus.

Re: contracting tables

From

merlyn@stonehenge.com (Randal L. Schwartz)

Date:

03 December 2001, 13:49:08

>>>>> "Jeff" == Jeff Eckermann <jeff_eckermann@yahoo.com> writes:

Jeff> If you don't have a lot of indexes, sequences,
Jeff> referential integrity constraints etc. the easiest way
Jeff> would be:
Jeff> SELECT INTO table2 DISTINCT * FROM table1;
Jeff> DROP table1;
Jeff> ALTER TABLE table2 RENAME TO table1;
Jeff> Then recreate your other objects/constraints.

Jeff> If you want to do it in place, then:
Jeff> DELETE FROM table1 
Jeff> WHERE EXISTS (
Jeff>      SELECT * FROM table1 AS t1 
Jeff>      WHERE t1.key < table1.key
Jeff> );
Jeff> You will need an index on your "key" value, or this
Jeff> will take a long time on a large table.

Or maybe something like:

DELETE FROM table1
WHERE oid NOT IN (SELECT min(oid) FROM table1 GROUP BY key1, key2)

Untested, and I might be a little fuzzy on the syntax.  This keeps
the lowest oid row for the given key1/key2 pair.  Change that to *
to remove duplicates across all columns.

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!