Thread: Duplicate rows

Duplicate rows

From
Andy Lewis
Date:
I have a postgres db with at least 1000 rows in it. I'd like to know if its
possible to detect duplicate values of a column if I don't know which values
that are dups?

Thanks

Andy


Re: [SQL] Duplicate rows

From
Oleg Broytmann
Date:
Hello.

On Fri, 15 May 1998, Andy Lewis wrote:
> I have a postgres db with at least 1000 rows in it. I'd like to know if its
> possible to detect duplicate values of a column if I don't know which values
> that are dups?

   Try to create UNIQUE index on a problematic column. If creating the
unique index failes - there are duplicates (providing that creating the
index wouldn't fail by other reasons - disk full, memory shortage, etc.)

Oleg.
----
  Oleg Broytmann     http://members.tripod.com/~phd2/     phd2@earthling.net
           Programmers don't die, they just GOSUB without RETURN.


Re: [SQL] Duplicate rows

From
Andy Lewis
Date:
Right, I know that there are dups in the column. But, I don't know where they
are nor do I know their value(s). I want to be able to find, say, two interger
values that are in the same column but, different rows.

Thanks

Andy

On Sat, 16 May 1998, Oleg Broytmann wrote:

>Hello.
>
>On Fri, 15 May 1998, Andy Lewis wrote:
>> I have a postgres db with at least 1000 rows in it. I'd like to know if its
>> possible to detect duplicate values of a column if I don't know which values
>> that are dups?
>
>   Try to create UNIQUE index on a problematic column. If creating the
>unique index failes - there are duplicates (providing that creating the
>index wouldn't fail by other reasons - disk full, memory shortage, etc.)
>
>Oleg.
>----
>  Oleg Broytmann     http://members.tripod.com/~phd2/     phd2@earthling.net
>           Programmers don't die, they just GOSUB without RETURN.
>
>


Re: [SQL] Duplicate rows

From
Oleg Broytmann
Date:
Hi!

On Sat, 16 May 1998, Andy Lewis wrote:
> Right, I know that there are dups in the column. But, I don't know where they
> are nor do I know their value(s). I want to be able to find, say, two interger
> values that are in the same column but, different rows.

   It seems that you need a correlated subquery - a loop for every row, that
tests whether there are equal values.

SELECT oid, mycolumn FROM mytable a
   WHERE mycolumn IN
      (SELECT oid, mycolumn FROM mytable b
         WHERE a.oid <> b.oid)

Or may be, join with the same table. Not sure what is better in this
situation.

SELECT oid, mycolumn FROM mytable a, mytable b
   WHERE a.oid <> b.oid AND
      a.mycolumn = b.mycolumn

   In both cases "a.oid <> b.oid" excludes the same row from comparison (I
am pretty sure that in the same row a.mycolumn = b.column :).

Oleg.
----
  Oleg Broytmann     http://members.tripod.com/~phd2/     phd2@earthling.net
           Programmers don't die, they just GOSUB without RETURN.


Re: [SQL] Duplicate rows

From
Bruce Stephens
Date:
Andy Lewis <alewis@mpsi.net> writes:

> Right, I know that there are dups in the column. But, I don't know
> where they are nor do I know their value(s). I want to be able to
> find, say, two interger values that are in the same column but,
> different rows.

How about

SELECT t1.col,t1.oid, t2.oid FROM fred AS t1, fred AS t2
    WHERE t1.col=t2.col AND t1.oid<t2.oid

(I haven't tested this, but it's the right kind of idea: select the
interesting bits from a self-join.)

Re: [SQL] Duplicate rows

From
Andy Lewis
Date:
Thanks, I'll give it a try.

Andy

On Sat, 16 May 1998, Oleg Broytmann wrote:

>Hi!
>
>On Sat, 16 May 1998, Andy Lewis wrote:
>> Right, I know that there are dups in the column. But, I don't know where they
>> are nor do I know their value(s). I want to be able to find, say, two interger
>> values that are in the same column but, different rows.
>
>   It seems that you need a correlated subquery - a loop for every row, that
>tests whether there are equal values.
>
>SELECT oid, mycolumn FROM mytable a
>   WHERE mycolumn IN
>      (SELECT oid, mycolumn FROM mytable b
>         WHERE a.oid <> b.oid)
>
>Or may be, join with the same table. Not sure what is better in this
>situation.
>
>SELECT oid, mycolumn FROM mytable a, mytable b
>   WHERE a.oid <> b.oid AND
>      a.mycolumn = b.mycolumn
>
>   In both cases "a.oid <> b.oid" excludes the same row from comparison (I
>am pretty sure that in the same row a.mycolumn = b.column :).
>
>Oleg.
>----
>  Oleg Broytmann     http://members..tripod.com/~phd2/     phd2@earthling.net
>           Programmers don't die, they just GOSUB without RETURN.
>
>