Thread: Duplicate rows
I have a postgres db with at least 1000 rows in it. I'd like to know if its possible to detect duplicate values of a column if I don't know which values that are dups? Thanks Andy
Hello. On Fri, 15 May 1998, Andy Lewis wrote: > I have a postgres db with at least 1000 rows in it. I'd like to know if its > possible to detect duplicate values of a column if I don't know which values > that are dups? Try to create UNIQUE index on a problematic column. If creating the unique index failes - there are duplicates (providing that creating the index wouldn't fail by other reasons - disk full, memory shortage, etc.) Oleg. ---- Oleg Broytmann http://members.tripod.com/~phd2/ phd2@earthling.net Programmers don't die, they just GOSUB without RETURN.
Right, I know that there are dups in the column. But, I don't know where they are nor do I know their value(s). I want to be able to find, say, two interger values that are in the same column but, different rows. Thanks Andy On Sat, 16 May 1998, Oleg Broytmann wrote: >Hello. > >On Fri, 15 May 1998, Andy Lewis wrote: >> I have a postgres db with at least 1000 rows in it. I'd like to know if its >> possible to detect duplicate values of a column if I don't know which values >> that are dups? > > Try to create UNIQUE index on a problematic column. If creating the >unique index failes - there are duplicates (providing that creating the >index wouldn't fail by other reasons - disk full, memory shortage, etc.) > >Oleg. >---- > Oleg Broytmann http://members.tripod.com/~phd2/ phd2@earthling.net > Programmers don't die, they just GOSUB without RETURN. > >
Hi! On Sat, 16 May 1998, Andy Lewis wrote: > Right, I know that there are dups in the column. But, I don't know where they > are nor do I know their value(s). I want to be able to find, say, two interger > values that are in the same column but, different rows. It seems that you need a correlated subquery - a loop for every row, that tests whether there are equal values. SELECT oid, mycolumn FROM mytable a WHERE mycolumn IN (SELECT oid, mycolumn FROM mytable b WHERE a.oid <> b.oid) Or may be, join with the same table. Not sure what is better in this situation. SELECT oid, mycolumn FROM mytable a, mytable b WHERE a.oid <> b.oid AND a.mycolumn = b.mycolumn In both cases "a.oid <> b.oid" excludes the same row from comparison (I am pretty sure that in the same row a.mycolumn = b.column :). Oleg. ---- Oleg Broytmann http://members.tripod.com/~phd2/ phd2@earthling.net Programmers don't die, they just GOSUB without RETURN.
Andy Lewis <alewis@mpsi.net> writes: > Right, I know that there are dups in the column. But, I don't know > where they are nor do I know their value(s). I want to be able to > find, say, two interger values that are in the same column but, > different rows. How about SELECT t1.col,t1.oid, t2.oid FROM fred AS t1, fred AS t2 WHERE t1.col=t2.col AND t1.oid<t2.oid (I haven't tested this, but it's the right kind of idea: select the interesting bits from a self-join.)
Thanks, I'll give it a try. Andy On Sat, 16 May 1998, Oleg Broytmann wrote: >Hi! > >On Sat, 16 May 1998, Andy Lewis wrote: >> Right, I know that there are dups in the column. But, I don't know where they >> are nor do I know their value(s). I want to be able to find, say, two interger >> values that are in the same column but, different rows. > > It seems that you need a correlated subquery - a loop for every row, that >tests whether there are equal values. > >SELECT oid, mycolumn FROM mytable a > WHERE mycolumn IN > (SELECT oid, mycolumn FROM mytable b > WHERE a.oid <> b.oid) > >Or may be, join with the same table. Not sure what is better in this >situation. > >SELECT oid, mycolumn FROM mytable a, mytable b > WHERE a.oid <> b.oid AND > a.mycolumn = b.mycolumn > > In both cases "a.oid <> b.oid" excludes the same row from comparison (I >am pretty sure that in the same row a.mycolumn = b.column :). > >Oleg. >---- > Oleg Broytmann http://members..tripod.com/~phd2/ phd2@earthling.net > Programmers don't die, they just GOSUB without RETURN. > >