Thread: Re[2]: [SQL] Query to eliminate duplicates

Re[2]: [SQL] Query to eliminate duplicates

From
Sferacarta Software
Date:
Hello Mario,

giovedì, 3 dicembre 98, you wrote:


MF> On 02-Dec-98 Sferacarta Software wrote:
>> MF> Is there a way to eliminate duplicate records using just SQL?
>>
>> SELECT DISTINCT ...

MF>         I deserved that!

MF>         The problem is that i have a table where there are duplicate records
MF> and i want to delete them! I tryed creating a unique index but it told me it
MF> couldn't because it didn't have a function to do something (I think it is
MF> because i had a boolean field in it)


select * from TABELA where COLUNA in (
     select COLUNA from TABELA group by COLUNA having 1 < count(COLUNA)
     );

This query should retrieve all rows where COLUNA field contains
duplicate values.
Unfortunately, seems it doesn't work on v6.4, but you may
do: select COLUNA from TABELA group by COLUNA having 1 < count(COLUNA)
to display only COLUNA field.

EXAMPLE:

select * from emp;
ename |empno|job       |  hiredate|sal      |comm|deptno|level| mgr
------+-----+----------+----------+---------+----+------+-----+----
ALLEN | 7499|SALESMAN  |1981-02-20|$1,600.00| 300|    20|    4|7782
BLAKE | 7698|MANAGER   |1981-05-01|$2,850.00|    |    30|    3|7782
JONES | 7900|CLERK     |1981-12-03|$950.00  |    |    30|    2|7782
MILLER| 7654|SALESMAN  |1981-09-28|$1,250.00| 400|    30|    3|7839
CLARK | 7844|SALESMAN  |1981-09-08|$1,500.00|    |    10|    2|7839
KING  | 7521|SALESMAN  |1981-02-22|$1,250.00| 500|    10|    1|7782
ALLEN | 7499|SALESMAN  |1981-02-20|$1,600.00| 300|    20|    4|7782
BLAKE | 7698|MANAGER   |1981-05-01|$2,850.00|    |    30|    3|7782
(8 rows)

select ename from emp group by ename having 1 < count(ename);
ename
-----
ALLEN
BLAKE
(2 rows)


-Jose'-



RE: Re[2]: [SQL] Query to eliminate duplicates

From
Mario Filipe
Date:
On 03-Dec-98 Sferacarta Software wrote:
> Hello Mario,
>
> giovedì, 3 dicembre 98, you wrote:
>
>
> MF> On 02-Dec-98 Sferacarta Software wrote:
>>> MF> Is there a way to eliminate duplicate records using just SQL?
>>>
>>> SELECT DISTINCT ...
>
> MF>         I deserved that!
>
> MF>         The problem is that i have a table where there are duplicate
> records
> MF> and i want to delete them! I tryed creating a unique index but it told me
> it
> MF> couldn't because it didn't have a function to do something (I think it is
> MF> because i had a boolean field in it)
>
>
> select * from TABELA where COLUNA in (
>      select COLUNA from TABELA group by COLUNA having 1 < count(COLUNA)
>      );
>
        This would work (i believe you) but in my case i'm looking into a table
where there are at least 3 columns to look at!

        It's a table where I keep the information about wich are the
disciplines where a student is enrolled:

        Ex:

numero_aluno|codigo_disciplina|ano_lectivo|epoca|semestre
------------+-----------------+-----------+-----+--------
       10335|             1207|1998/1999  |    1|f
       10335|             1208|1998/1999  |    1|f
       10335|             1209|1998/1999  |    1|f
       10335|             1203|1998/1999  |    1|f
       10335|             1205|1998/1999  |    1|f
       10335|             1212|1998/1999  |    1|f
       10335|             1213|1998/1999  |    1|f
       10335|             1215|1998/1999  |    1|f


In this case there are no duplicates but in this one there are:

numero_aluno|codigo_disciplina|ano_lectivo|epoca|semestre
------------+-----------------+-----------+-----+--------
       13427|               35|1998/1999  |    1|t
       13427|              904|1998/1999  |    1|f
       13427|              122|1998/1999  |    1|t
       13427|              907|1998/1999  |    1|f
       13427|              481|1998/1999  |    1|f
       13427|              286|1998/1999  |    1|t
       13427|              368|1998/1999  |    1|t
       13427|               35|1998/1999  |    1|t
       13427|              904|1998/1999  |    1|f
       13427|              122|1998/1999  |    1|t


So a duplicate is actually a record that looks exactly like other record on the
table...

        Thanks for your help anyway

        Mario Filipe
        mjnf@uevora.pt
        http://neptuno.sc.uevora.pt/~mjnf

Attachment

Re: Re[2]: [SQL] Query to eliminate duplicates

From
Leslie Mikesell
Date:
According to Mario Filipe:
>
> >>> MF> Is there a way to eliminate duplicate records using just SQL?
> >>>
> >>> SELECT DISTINCT ...
> >
> >
> > MF>         The problem is that i have a table where there are duplicate
> > records
> > MF> and i want to delete them! I tryed creating a unique index but it told me
> > it
> > MF> couldn't because it didn't have a function to do something (I think it is
> > MF> because i had a boolean field in it)
>
> So a duplicate is actually a record that looks exactly like other record on the
> table...

How about:

select distinct * into newtable from oldtable ;
drop table oldtable;
alter table newtable rename to oldtable;

You have to drop/create indexes separately, even the ones built
implicitly from 'primary key' directives.

   Les Mikesell
    les@mcs.com

Re[4]: [SQL] Query to eliminate duplicates

From
Sferacarta Software
Date:
Hello Mario,

giovedì, 3 dicembre 98, you wrote:


MF> On 03-Dec-98 Sferacarta Software wrote:
>> Hello Mario,
>>
>> giovedì, 3 dicembre 98, you wrote:
>>
>>
>> MF> On 02-Dec-98 Sferacarta Software wrote:
>>>> MF> Is there a way to eliminate duplicate records using just SQL?
>>>>
>>>> SELECT DISTINCT ...
>>
>> MF>         I deserved that!
>>
>> MF>         The problem is that i have a table where there are duplicate
>> records
>> MF> and i want to delete them! I tryed creating a unique index but it told me
>> it
>> MF> couldn't because it didn't have a function to do something (I think it is
>> MF> because i had a boolean field in it)
>>
>>
>> select * from TABELA where COLUNA in (
>>      select COLUNA from TABELA group by COLUNA having 1 < count(COLUNA)
>>      );
>>
MF>         This would work (i believe you) but in my case i'm looking into a table
MF> where there are at least 3 columns to look at!

MF>         It's a table where I keep the information about wich are the
MF> disciplines where a student is enrolled:

MF>         Ex:

MF> numero_aluno|codigo_disciplina|ano_lectivo|epoca|semestre
MF> ------------+-----------------+-----------+-----+--------
MF>        10335|             1207|1998/1999  |    1|f
MF>        10335|             1208|1998/1999  |    1|f
MF>        10335|             1209|1998/1999  |    1|f
MF>        10335|             1203|1998/1999  |    1|f
MF>        10335|             1205|1998/1999  |    1|f
MF>        10335|             1212|1998/1999  |    1|f
MF>        10335|             1213|1998/1999  |    1|f
MF>        10335|             1215|1998/1999  |    1|f


MF> In this case there are no duplicates but in this one there are:

MF> numero_aluno|codigo_disciplina|ano_lectivo|epoca|semestre
MF> ------------+-----------------+-----------+-----+--------
MF>        13427|               35|1998/1999  |    1|t
MF>        13427|              904|1998/1999  |    1|f
MF>        13427|              122|1998/1999  |    1|t
MF>        13427|              907|1998/1999  |    1|f
MF>        13427|              481|1998/1999  |    1|f
MF>        13427|              286|1998/1999  |    1|t
MF>        13427|              368|1998/1999  |    1|t
MF>        13427|               35|1998/1999  |    1|t
MF>        13427|              904|1998/1999  |    1|f
MF>        13427|              122|1998/1999  |    1|t


MF> So a duplicate is actually a record that looks exactly like other record on the
MF> table...

MF>         Thanks for your help anyway

MF>         Mario Filipe
MF>         mjnf@uevora.pt
MF>         http://neptuno.sc.uevora.pt/~mjnf


select * from cursos;
numero_aluno|codigo_disciplina|ano_lectivo|epoca|semestre
------------+-----------------+-----------+-----+--------
       13427|               35|1998/1999  |    1|t
       13427|              904|1998/1999  |    1|f
       13427|              122|1998/1999  |    1|t
       13427|              907|1998/1999  |    1|f
       13427|              481|1998/1999  |    1|f
       13427|              286|1998/1999  |    1|t
       13427|              368|1998/1999  |    1|t
       13427|               35|1998/1999  |    1|t
       13427|              904|1998/1999  |    1|f
       13427|              122|1998/1999  |    1|t
(10 rows)

create table temp as select distinct * from cursos;
SELECT
select * from temp;
numero_aluno|codigo_disciplina|ano_lectivo|epoca|semestre
------------+-----------------+-----------+-----+--------
       13427|               35|1998/1999  |    1|t
       13427|              122|1998/1999  |    1|t
       13427|              286|1998/1999  |    1|t
       13427|              368|1998/1999  |    1|t
       13427|              481|1998/1999  |    1|f
       13427|              904|1998/1999  |    1|f
       13427|              907|1998/1999  |    1|f
(7 rows)

DROP TABLE cursos;
DROP
alter table temp rename to cursos;
RENAME

-Jose'-



RE: Re[4]: [SQL] Query to eliminate duplicates

From
Mario Filipe
Date:
> select * from cursos;
> numero_aluno|codigo_disciplina|ano_lectivo|epoca|semestre
> ------------+-----------------+-----------+-----+--------
>        13427|               35|1998/1999  |    1|t
>        13427|              904|1998/1999  |    1|f
>        13427|              122|1998/1999  |    1|t
>        13427|              907|1998/1999  |    1|f
>        13427|              481|1998/1999  |    1|f
>        13427|              286|1998/1999  |    1|t
>        13427|              368|1998/1999  |    1|t
>        13427|               35|1998/1999  |    1|t
>        13427|              904|1998/1999  |    1|f
>        13427|              122|1998/1999  |    1|t
> (10 rows)
>
> create table temp as select distinct * from cursos;
> SELECT
> select * from temp;
> numero_aluno|codigo_disciplina|ano_lectivo|epoca|semestre
> ------------+-----------------+-----------+-----+--------
>        13427|               35|1998/1999  |    1|t
>        13427|              122|1998/1999  |    1|t
>        13427|              286|1998/1999  |    1|t
>        13427|              368|1998/1999  |    1|t
>        13427|              481|1998/1999  |    1|f
>        13427|              904|1998/1999  |    1|f
>        13427|              907|1998/1999  |    1|f
> (7 rows)
>
> DROP TABLE cursos;
> DROP
> alter table temp rename to cursos;
> RENAME

        Thanks! I had thought about this a while ago... but then I forgott it!


        Mario Filipe
        mjnf@uevora.pt
        http://neptuno.sc.uevora.pt/~mjnf (modificada em 26/11/98)

Attachment