Thread: distinct / group by assistance.

distinct / group by assistance.

From
Gavin 'Beau' Baumanis
Date:
Hi Everyone,

I am having some issues trying to create the required SQL that will  
allow me to return the results I am after.
I have been trying various incarnations, using group by, sub-queries -  
albeit to no avail - for the past three hours.


Consider the following simple SQL;

select    a.foo,    b.bar,    c.something
fromtable1 a,
inner jointable2 b on b.id =a.id
left outer jointable3 on c.id = a.id

If there a multiple rows of the same id in table1, I get all  
(multiple) rows - as you would expect - of course.

What I need however, is only one row returned per instance a.id that  
is returned by the above query.

I thought of using group by - but there are no calculated fields...  
and the real query contains 32 fields, which according to the errors I  
ran into while trying to get this working, would all need to be  
included in the group by clause.

So my understanding of group by is obviously a little dodgy - and  
obviously not quite what I was expecting.
I tried using a sub query and select distinct a.id..... but that  
didn't exactly help either.

I have deliberately included in my example the fact there is an outer  
join too - I am not sue if that matters or not... but just in case  
thought it prudent to include it here.


Thanks in advance for any help you might have.

Beau.


Re: distinct / group by assistance.

From
Tom Lane
Date:
"Gavin 'Beau' Baumanis" <gavinb@eclinic.com.au> writes:
> ... If there a multiple rows of the same id in table1, I get all  
> (multiple) rows - as you would expect - of course.

> What I need however, is only one row returned per instance a.id that  
> is returned by the above query.

You need GROUP BY a.id.

> I thought of using group by - but there are no calculated fields...  
> and the real query contains 32 fields, which according to the errors I  
> ran into while trying to get this working, would all need to be  
> included in the group by clause.

No, you wouldn't want to do that, because then you'd be back to multiple
rows per a.id value.

The problem here is that for any one a.id value there could be multiple
values of the other variables (coming from different rows) and so the
query results are not well defined if you just add "GROUP BY a.id".
What you need to do is decide which of those values you want and use
an aggregate function to get it.  So your query might end up looking
likeselect a.id, min(a.foo), avg(b.bar), ... from ... group by a.id;
        regards, tom lane


Re: distinct / group by assistance.

From
Gavin 'Beau' Baumanis
Date:
Hi Tom I am a nut.

please find below my correct requirements.

select   a.foo,   b.bar,   c.something
fromtable1 a,
inner jointable2 b on b.id =a.id
left outer jointable3 on c.id = a.id

If there a multiple rows of the same id in table2,(one (a) to many (b)  
relationship) I get all (multiple) rows - as you would expect - of course.

What I need however, is only one row returned per instance of a.id  
that is returned by the above query - the one with the greatest  
b.primaryKey would be ok.

I thought of using group by - but there are no calculated fields...  
and the real query contains 32 fields, which according to the errors I  
ran into while trying to get this working, would all need to be  
included in the group by clause.

So my understanding of group by is obviously a little dodgy - and  
obviously not quite what I was expecting.
I tried using a sub query and select distinct a.id..... but that  
didn't exactly help either.

I have deliberately included in my example the fact there is an outer  
join too - I am not sue if that matters or not... but just in case  
thought it prudent to include it here.
On 29/06/2008, at 1:43 AM, Tom Lane wrote:

> "Gavin 'Beau' Baumanis" <gavinb@eclinic.com.au> writes:
>> ... If there a multiple rows of the same id in table1, I get all
>> (multiple) rows - as you would expect - of course.
>
>> What I need however, is only one row returned per instance a.id that
>> is returned by the above query.
>
> You need GROUP BY a.id.
>
>> I thought of using group by - but there are no calculated fields...
>> and the real query contains 32 fields, which according to the  
>> errors I
>> ran into while trying to get this working, would all need to be
>> included in the group by clause.
>
> No, you wouldn't want to do that, because then you'd be back to  
> multiple
> rows per a.id value.
>
> The problem here is that for any one a.id value there could be  
> multiple
> values of the other variables (coming from different rows) and so the
> query results are not well defined if you just add "GROUP BY a.id".
> What you need to do is decide which of those values you want and use
> an aggregate function to get it.  So your query might end up looking
> like
>     select a.id, min(a.foo), avg(b.bar), ... from ... group by a.id;
>
>             regards, tom lane

Please contact me if you should have any questions.

Gavin 'Beau' Baumanis
Senior Application Developer
PalCare Pty. Ltd.

E: beau@palcare.com.au
T: +61 -3 9381 4567
M: +61 -4 38 545 586
W: http://www.palcare.com.au