Home > mailing lists

Re: identifying duplicates in table with redundancies - Mailing list pgsql-sql

From	Oliveiros d'Azevedo Cristina
Subject	Re: identifying duplicates in table with redundancies
Date	September 28, 2010 07:34:46
Msg-id	5F1C28C4D61848D6A012C5DD84B22226@marktestcr.marktest.pt Whole thread Raw
In response to	identifying duplicates in table with redundancies (Tarlika Elisabeth Schmitz <postgresql@numerixtechnology.de>)
Responses	Re: identifying duplicates in table with redundancies
List	pgsql-sql

Tree view

Hello again,
Tarlika.

In what concerns to indices, I 'm affraid I may not be the best person to 
advise you, my knowledge of them hardly goes beyond the most trivial cases.

I'm sure there are plenty of other people on the list who are able to give 
you better advise than me.
But, on this query in particular I would recomend an indice on trainer_name, 
as this field will be used on the join and on the group by.
For the other query, the one you get by substituting trainer_name by 
trainer_id, place an index on trainer_id.
Also, these indexes may help speed up the order by clause, if you use one.

If you have a table with lots of data you can try them around and see how 
performance varies (and don't forget there's also EXPLAIN ANALYZE)

Bear in mind, though, that this is just my 2 cents on a matter that I don't 
really master. Do not take this as an expert answer.

There are many people on this list that can help you better

Best,
Oliveiros

----- Original Message ----- 
From: "Tarlika Elisabeth Schmitz" <postgresql@numerixtechnology.de>
To: <pgsql-sql@postgresql.org>
Sent: Monday, September 27, 2010 5:54 PM
Subject: Re: [SQL] identifying duplicates in table with redundancies

> On Fri, 24 Sep 2010 18:12:18 +0100
> Oliver d'Azevedo Christina <oliveiros.cristina@gmail.com> wrote:
>
>>>> SELECT DISTINCT trainer_id,trainer_name
>>>> FROM (
>>>> SELECT trainer_name   -- The field you want to test for duplicates
>>>> FROM (
>>>> SELECT DISTINCT "trainer_id","trainer_name"
>>>> FROM student
>>>> ) x
>>>> GROUP BY "trainer_name"    -- the field you want to test for
>>>> duplicates
>>>> HAVING (COUNT(*) > 1)
>>>> ) z
>>>> NATURAL JOIN student y
>
>
>
> What indices would you recommend for this operation?
>
> -- 
>
> Best Regards,
> Tarlika Elisabeth Schmitz
>
> -- 
> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-sql

pgsql-sql by date:

From: gangadhar bandaru
Date: 28 September 2010, 06:02:27
Subject: eB7E2R

From: msi77
Date: 28 September 2010, 11:08:09
Subject: Re: is there a distinct function for comma lists ?

Re: identifying duplicates in table with redundancies - Mailing list pgsql-sql

Previous

Next