Thread: Subsorting GROUP BY data

Subsorting GROUP BY data

From
"Johnson, Michael L."
Date:
Given the following table:

ID  |  Cat  |  Num
----|-------|------
Z   |   A   |   0
Y   |   A   |   1
X   |   A   |   2
W   |   B   |   0
V   |   B   |   1
U   |   B   |   2
T   |   C   |   0
S   |   C   |   1
R   |   C   |   2

I want to do this:  Group the items by the cat field.  Then select the
ID where the num is the highest in the group; so it should return
something like:

Cat  |  ID  |  Num
-----|------|------ A  |  X   |   2 B  |  U   |   2 C  |  R   |   2


Using SQL like this, I can get the category and the highest # in the
category:

SELECT cat, MAX(num) FROM my_table GROUP_BY cat;

But if I add the "id" column, of course it doesn't work, since it's not
in an aggregate function or in the GROUP_BY clause.  So I found a post
at http://archives.postgresql.org/pgsql-hackers/2006-03/msg01324.php
which describes how to add a "FIRST" and "LAST" aggregate function to
PGSQL.  However, first and last don't seem to help unless you are able
to "subsort" the grouping by the # (ie, group by cat, then subsort on
num, and select the "last" one of the group).

I would think something like the following would work, except that PGSQL
does not like the SQL generated (it basically says I can't have a
GROUP_BY after an ORDER_BY).  And if I move the "ORDER_BY" to the end,
that just orders the returned groupings, so that doesn't help me either.

SELECT cat, LAST(id), LAST(num) FROM my_table ORDER_BY num GROUP_BY cat;


So does anyone know how to sort *within* a grouping so that FIRST and
LAST return meaningful results?

Thanks in advance,
Mike Johnson


Re: Subsorting GROUP BY data

From
"Oliveiros Cristina"
Date:
If it is to Group the items by cat field then select the ID where the num is 
the highest in group,

You could maybe try

SELECT a.ID, b.Cat,b.Num
FROM my_table a
JOIN ( SELECT cat, MAX(num) as maximo FROM my_table GROUP_BY cat) b
ON a.Cat = b.Cat
AND a.Num = b.maximo;

It 'll probably give what you need (with minor fixes...)

Best,
Oliveiros


----- Original Message ----- 
From: "Johnson, Michael L." <michael.l.johnson@ngc.com>
To: <pgsql-sql@postgresql.org>
Sent: Monday, November 10, 2008 2:56 PM
Subject: [SQL] Subsorting GROUP BY data


Given the following table:

ID  |  Cat  |  Num
----|-------|------
Z   |   A   |   0
Y   |   A   |   1
X   |   A   |   2
W   |   B   |   0
V   |   B   |   1
U   |   B   |   2
T   |   C   |   0
S   |   C   |   1
R   |   C   |   2

I want to do this:  Group the items by the cat field.  Then select the
ID where the num is the highest in the group; so it should return
something like:

Cat  |  ID  |  Num
-----|------|------ A  |  X   |   2 B  |  U   |   2 C  |  R   |   2


Using SQL like this, I can get the category and the highest # in the
category:

SELECT cat, MAX(num) FROM my_table GROUP_BY cat;

But if I add the "id" column, of course it doesn't work, since it's not
in an aggregate function or in the GROUP_BY clause.  So I found a post
at http://archives.postgresql.org/pgsql-hackers/2006-03/msg01324.php
which describes how to add a "FIRST" and "LAST" aggregate function to
PGSQL.  However, first and last don't seem to help unless you are able
to "subsort" the grouping by the # (ie, group by cat, then subsort on
num, and select the "last" one of the group).

I would think something like the following would work, except that PGSQL
does not like the SQL generated (it basically says I can't have a
GROUP_BY after an ORDER_BY).  And if I move the "ORDER_BY" to the end,
that just orders the returned groupings, so that doesn't help me either.

SELECT cat, LAST(id), LAST(num) FROM my_table ORDER_BY num GROUP_BY cat;


So does anyone know how to sort *within* a grouping so that FIRST and
LAST return meaningful results?

Thanks in advance,
Mike Johnson

-- 
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql



Re: Subsorting GROUP BY data

From
"Fernando Hevia"
Date:
> -----Mensaje original-----
> De: pgsql-sql-owner@postgresql.org 
> [mailto:pgsql-sql-owner@postgresql.org] En nombre de Johnson, 
> Michael L.
> Enviado el: Lunes, 10 de Noviembre de 2008 12:57
> Para: pgsql-sql@postgresql.org
> Asunto: [SQL] Subsorting GROUP BY data
> 
> Given the following table:
> 
> ID  |  Cat  |  Num
> ----|-------|------
> Z   |   A   |   0
> Y   |   A   |   1
> X   |   A   |   2
> W   |   B   |   0
> V   |   B   |   1
> U   |   B   |   2
> T   |   C   |   0
> S   |   C   |   1
> R   |   C   |   2
> 
> I want to do this:  Group the items by the cat field.  Then 
> select the ID where the num is the highest in the group; so 
> it should return something like:
> 
> Cat  |  ID  |  Num
> -----|------|------
>   A  |  X   |   2
>   B  |  U   |   2
>   C  |  R   |   2
> 
> 
> Using SQL like this, I can get the category and the highest # in the
> category:
> 
> SELECT cat, MAX(num) FROM my_table GROUP_BY cat;
> 
> But if I add the "id" column, of course it doesn't work, 
> since it's not in an aggregate function or in the GROUP_BY 
> clause.  So I found a post at 
> http://archives.postgresql.org/pgsql-hackers/2006-03/msg01324.php
> which describes how to add a "FIRST" and "LAST" aggregate 
> function to PGSQL.  However, first and last don't seem to 
> help unless you are able to "subsort" the grouping by the # 
> (ie, group by cat, then subsort on num, and select the "last" 
> one of the group).
> 

I wonder if this suites you:

SELECT sub.cat, t.id, sub.Num FROM my_table t, ( SELECT cat, MAX(num) as Num FROM my_table GROUP_BY cat
) subWHERE t.cat = sub.cat AND t.Num = sub.Num
ORDER BY t.cat;


Regards,
Fernando.




Re: Subsorting GROUP BY data

From
tv@fuzzy.cz
Date:
What about replacing the table by

SELECT * FROM my_table ORDER BY num

i.e. something like

SELECT cat, LAST(id), LAST(num) FROM (SELECT * FROM my_table ORDER BY num)
AS foo GROUP_BY cat;

Hope it works, just guessing it might help :-)

regards
Tomas

> SELECT cat, MAX(num) FROM my_table GROUP_BY cat;
>
> But if I add the "id" column, of course it doesn't work, since it's not
> in an aggregate function or in the GROUP_BY clause.  So I found a post
> at http://archives.postgresql.org/pgsql-hackers/2006-03/msg01324.php
> which describes how to add a "FIRST" and "LAST" aggregate function to
> PGSQL.  However, first and last don't seem to help unless you are able
> to "subsort" the grouping by the # (ie, group by cat, then subsort on
> num, and select the "last" one of the group).
>
> I would think something like the following would work, except that PGSQL
> does not like the SQL generated (it basically says I can't have a
> GROUP_BY after an ORDER_BY).  And if I move the "ORDER_BY" to the end,
> that just orders the returned groupings, so that doesn't help me either.
>
> SELECT cat, LAST(id), LAST(num) FROM my_table ORDER_BY num GROUP_BY cat;
>
>
> So does anyone know how to sort *within* a grouping so that FIRST and
> LAST return meaningful results?



Re: Subsorting GROUP BY data

From
"Johnson, Michael L."
Date:
Thanks! That's perfect, because now I don't need the FIRST/LAST
aggregate functions!

Mike

-----Original Message-----
From: pgsql-sql-owner@postgresql.org
[mailto:pgsql-sql-owner@postgresql.org] On Behalf Of Fernando Hevia
Sent: Monday, November 10, 2008 10:30 AM
To: Johnson, Michael L.; pgsql-sql@postgresql.org
Subject: Re: [SQL] Subsorting GROUP BY data


> -----Mensaje original-----
> De: pgsql-sql-owner@postgresql.org
> [mailto:pgsql-sql-owner@postgresql.org] En nombre de Johnson, Michael
> L.
> Enviado el: Lunes, 10 de Noviembre de 2008 12:57
> Para: pgsql-sql@postgresql.org
> Asunto: [SQL] Subsorting GROUP BY data
>
> Given the following table:
>
> ID  |  Cat  |  Num
> ----|-------|------
> Z   |   A   |   0
> Y   |   A   |   1
> X   |   A   |   2
> W   |   B   |   0
> V   |   B   |   1
> U   |   B   |   2
> T   |   C   |   0
> S   |   C   |   1
> R   |   C   |   2
>
> I want to do this:  Group the items by the cat field.  Then select the

> ID where the num is the highest in the group; so it should return
> something like:
>
> Cat  |  ID  |  Num
> -----|------|------
>   A  |  X   |   2
>   B  |  U   |   2
>   C  |  R   |   2
>
>
> Using SQL like this, I can get the category and the highest # in the
> category:
>
> SELECT cat, MAX(num) FROM my_table GROUP_BY cat;
>
> But if I add the "id" column, of course it doesn't work, since it's
> not in an aggregate function or in the GROUP_BY clause.  So I found a
> post at
> http://archives.postgresql.org/pgsql-hackers/2006-03/msg01324.php
> which describes how to add a "FIRST" and "LAST" aggregate function to
> PGSQL.  However, first and last don't seem to help unless you are able

> to "subsort" the grouping by the # (ie, group by cat, then subsort on
> num, and select the "last"
> one of the group).
>

I wonder if this suites you:

SELECT sub.cat, t.id, sub.Num FROM my_table t, ( SELECT cat, MAX(num) as Num FROM my_table GROUP_BY
cat
) subWHERE t.cat = sub.cat AND t.Num = sub.Num
ORDER BY t.cat;


Regards,
Fernando.



--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql