Thread: Determining Rank

Determining Rank

From
Don Drake
Date:
I have a query that shows the top N count(*)'s.

So it's basically:

select some_val, count(*) 
from big_table
group by some_val
order by count(*)
limit 50

Now, I would like to have the rank included in the result set.  The
first row would be 1, followed by 2, etc. all the way to 50.

I can do this in PHP, but stuffing it into the DB query makes other
things easier.

Any ideas?

Thanks.

-Don

-- 
Donald Drake
President
Drake Consulting
http://www.drakeconsult.com/
http://www.MailLaunder.com/
312-560-1574


Re: Determining Rank

From
Michael Fuhr
Date:
On Thu, Feb 03, 2005 at 09:06:36PM -0600, Don Drake wrote:

> select some_val, count(*) 
> from big_table
> group by some_val
> order by count(*)
> limit 50
> 
> Now, I would like to have the rank included in the result set.  The
> first row would be 1, followed by 2, etc. all the way to 50.

Maybe use a temporary sequence?

CREATE TEMPORARY SEQUENCE rank_seq;

SELECT nextval('rank_seq') AS rank, *
FROM (SELECT some_val, count(*)     FROM big_table     GROUP BY some_val     ORDER BY count(*)     LIMIT 50) AS s;

DROP SEQUENCE rank_seq;

I don't know if row order is guaranteed to survive a subquery,
however.

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/


Re: Determining Rank

From
Michael Glaesemann
Date:
On Feb 4, 2005, at 12:06, Don Drake wrote:

> I have a query that shows the top N count(*)'s.
>
> So it's basically:
>
> select some_val, count(*)
> from big_table
> group by some_val
> order by count(*)
> limit 50
>
> Now, I would like to have the rank included in the result set.  The
> first row would be 1, followed by 2, etc. all the way to 50.

There are a couple of different ways to go about this. One is just to 
append an extra column that's basically a line number, but I find it 
doesn't handle ties very elegantly. The following example uses a 
correlated subquery using HAVING to determine the rank as "the number 
of items that have a total quantity greater than the current item + 1". 
Note that items bar and baz have exactly the same totals and are tied, 
while the rank of bat shows that there are 3 items that have totals 
greater than bat.

Joe Celko's "SQL for Smarties" has a bunch of things like this in it. 
I've found it quite helpful.

Hope this helps.

Michael Glaesemann
grzm myrealbox com

create table items (    item text not null    , qty integer not null
) without oids;

insert into items (item, qty) values ('foo', 1);
insert into items (item, qty) values ('foo', 2);
insert into items (item, qty) values ('foo', 1);
insert into items (item, qty) values ('foo', 3);
insert into items (item, qty) values ('foo', 3);
insert into items (item, qty) values ('foo', 20);
insert into items (item, qty) values ('foo', 1);
insert into items (item, qty) values ('bar', 3);
insert into items (item, qty) values ('bar', 1);
insert into items (item, qty) values ('bar', 3);
insert into items (item, qty) values ('bar', 13);
insert into items (item, qty) values ('baz', 2);
insert into items (item, qty) values ('baz', 4);
insert into items (item, qty) values ('baz', 14);
insert into items (item, qty) values ('bat', 3);
insert into items (item, qty) values ('bat', 4);

select item, sum(qty) as tot_qty
from items
group by item
order by tot_qty desc;
 item | tot_qty
------+--------- foo  |      31 bar  |      20 baz  |      20 bat  |       7
(4 rows)

select i1.item    , i1.tot_qty    , ( select count(*)        from (            select item                , sum(qty) as
tot_qty           from items            group by item            having sum(qty) > i1.tot_qty            ) as i2
)+ 1 as rank
 
from (    select item        , sum(qty) as tot_qty    from items    group by item    ) as i1
order by i1.tot_qty desc;
 item | tot_qty | rank
------+---------+------ foo  |      31 |    1 bar  |      20 |    2 baz  |      20 |    2 bat  |       7 |    4
(4 rows)



Re: Determining Rank

From
Don Drake
Date:
Michael,

That's an excellent solution, but on my table, the explain plan sucks
and the query time is over 3 minutes when implemented.  Is there a
simple way to get a row_num without using a temporary sequence?

Thanks for your help.

-Don


On Fri, 4 Feb 2005 14:02:20 +0900, Michael Glaesemann
<grzm@myrealbox.com> wrote:
> 
> On Feb 4, 2005, at 12:06, Don Drake wrote:
> 
> > I have a query that shows the top N count(*)'s.
> >
> > So it's basically:
> >
> > select some_val, count(*)
> > from big_table
> > group by some_val
> > order by count(*)
> > limit 50
> >
> > Now, I would like to have the rank included in the result set.  The
> > first row would be 1, followed by 2, etc. all the way to 50.
> 
> There are a couple of different ways to go about this. One is just to
> append an extra column that's basically a line number, but I find it
> doesn't handle ties very elegantly. The following example uses a
> correlated subquery using HAVING to determine the rank as "the number
> of items that have a total quantity greater than the current item + 1".
> Note that items bar and baz have exactly the same totals and are tied,
> while the rank of bat shows that there are 3 items that have totals
> greater than bat.
> 
> Joe Celko's "SQL for Smarties" has a bunch of things like this in it.
> I've found it quite helpful.
> 
> Hope this helps.
> 
> Michael Glaesemann
> grzm myrealbox com
> 
> create table items (
>      item text not null
>      , qty integer not null
> ) without oids;
> 
> insert into items (item, qty) values ('foo', 1);
> insert into items (item, qty) values ('foo', 2);
> insert into items (item, qty) values ('foo', 1);
> insert into items (item, qty) values ('foo', 3);
> insert into items (item, qty) values ('foo', 3);
> insert into items (item, qty) values ('foo', 20);
> insert into items (item, qty) values ('foo', 1);
> insert into items (item, qty) values ('bar', 3);
> insert into items (item, qty) values ('bar', 1);
> insert into items (item, qty) values ('bar', 3);
> insert into items (item, qty) values ('bar', 13);
> insert into items (item, qty) values ('baz', 2);
> insert into items (item, qty) values ('baz', 4);
> insert into items (item, qty) values ('baz', 14);
> insert into items (item, qty) values ('bat', 3);
> insert into items (item, qty) values ('bat', 4);
> 
> select item, sum(qty) as tot_qty
> from items
> group by item
> order by tot_qty desc;
> 
>   item | tot_qty
> ------+---------
>   foo  |      31
>   bar  |      20
>   baz  |      20
>   bat  |       7
> (4 rows)
> 
> select i1.item
>      , i1.tot_qty
>      , ( select count(*)
>          from (
>              select item
>                  , sum(qty) as tot_qty
>              from items
>              group by item
>              having sum(qty) > i1.tot_qty
>              ) as i2
>          ) + 1 as rank
> from (
>      select item
>          , sum(qty) as tot_qty
>      from items
>      group by item
>      ) as i1
> order by i1.tot_qty desc;
> 
>   item | tot_qty | rank
> ------+---------+------
>   foo  |      31 |    1
>   bar  |      20 |    2
>   baz  |      20 |    2
>   bat  |       7 |    4
> (4 rows)
> 
> 


-- 
Donald Drake
President
Drake Consulting
http://www.drakeconsult.com/
http://www.MailLaunder.com/
312-560-1574


Re: Determining Rank

From
PFC
Date:
> Michael,
>
> That's an excellent solution, but on my table, the explain plan sucks
> and the query time is over 3 minutes when implemented.  Is there a
> simple way to get a row_num without using a temporary sequence?
>
> Thanks for your help.
>
> -Don
Make your query a set returning function which iterates over the query  
results and returns a number as well ?