Thread: hpw to Count without group by

hpw to Count without group by

From
Yudie Pg
Date:
Hello,
I have a table, structure like this:
 
create table product(
 sku, int4 not null,
 category int4 null,
 display_name varchar(100) null,
 rank int4 null
)
 
let say example data:
sku, category, display_name
=======================
10001, 5, postgresql, 132
10002, 5, mysql, 243
10003, 5, oracle, 323
10006, 7, photoshop, 53 
10007, 7, flash mx, 88
10008, 9, Windows XP, 44
10008, 9, Linux, 74
 
Expected query result:
 
sku, category, display_name, category_count
====================================
10001, 5, postgresql, 3
10006, 7, photoshop, 2
10008, 9, Windows XP, 2
 
The idea is getting getting highest ranking each product category and COUNT how many products in the category with SINGLE query.
 
the first 3 columns can be done with select distinct on (category) ... order by category, rank desc but it still missing the category_count. I wish no subquery needed for having simplest query plan.
 
 
Thank you. 
 
 
Yudie G.

Re: hpw to Count without group by

From
Ragnar Hafstað
Date:
On Wed, 2005-06-01 at 16:16 -0500, Yudie Pg wrote:
> Hello,
> I have a table, structure like this:

 [...]

> Expected query result:
>
> sku, category, display_name, category_count
> ====================================
> 10001, 5, postgresql, 3
> 10006, 7, photoshop, 2
> 10008, 9, Windows XP, 2
>
> The idea is getting getting highest ranking each product category and
> COUNT how many products in the category with SINGLE query.
>
> the first 3 columns can be done with select distinct on (category) ...
> order by category, rank desc but it still missing the category_count.
> I wish no subquery needed for having simplest query plan.

how about a simple join ?

select sku,category,display_name,count
    from
         (select distinct on (category) category, sku,display_name
                from product order by category,rank
         ) as foo
    natural join
          (select category,count(*) as count
                from product group by category
          ) as bar;

gnari



Re: hpw to Count without group by

From
Edmund Bacon
Date:
yudiepg@gmail.com (Yudie Pg) writes:

> Hello,
> I have a table, structure like this:
>  create table product(
>  sku, int4 not null,
>  category int4 null,
>  display_name varchar(100) null,
>  rank int4 null
> )
>  let say example data:
> sku, category, display_name
> =======================
> 10001, 5, postgresql, 132
> 10002, 5, mysql, 243
> 10003, 5, oracle, 323
> 10006, 7, photoshop, 53
> 10007, 7, flash mx, 88
> 10008, 9, Windows XP, 44
>  10008, 9, Linux, 74
>  Expected query result:
>  sku, category, display_name, category_count
> ====================================
> 10001, 5, postgresql, 3
>  10006, 7, photoshop, 2
>  10008, 9, Windows XP, 2
>  The idea is getting getting highest ranking each product category and COUNT
> how many products in the category with SINGLE query.
>  the first 3 columns can be done with select distinct on (category) ...
> order by category, rank desc but it still missing the category_count. I wish
> no subquery needed for having simplest query plan.
>   Thank you.
>   Yudie G.

I do not believe you can do this without a subquery - you are trying
to get 2 separate pieces of information from your data
   * some data about the record having MAX(rank) for each category
and
   * the count of records in each category

Note, however that you can get MAX(rank) and COUNT(category) in one
sequential pass of the data: e.g
 SELECT category, MAX(rank), COUNT(category) FROM product;

Joining this with the orignal table is not too dificult :

SELECT sku, category, display_name, category_count
  FROM  product
  JOIN (SELECT category, MAX(rank) AS rank, COUNT(category) AS category_count
               FROM product
               GROUP BY category) subq
        USING(category, rank)
 ORDER BY sku;

Depending on what your data looks like, you might improve things by
having an index on category, and perhaps on (category, rank).

Note that there is may be a problem with this query: If you have more
than one product with the same rank in the same category, you may get
more than one record for that category.  Apply distinct on as
neccessary.

--
Remove -42 for email

Re: hpw to Count without group by

From
Yudie Pg
Date:
I do not believe you can do this without a subquery - you are trying
to get 2 separate pieces of information from your data
  * some data about the record having MAX(rank) for each category
and
  * the count of records in each category
 
Hi, I guess i try to answer my own question which end up with creating stored procedure.
Unless you have direct query idea.
 
This function cut the half of query time, as my concern about postgres count agregate function is always slower than I expected.
 
SQL:

CREATE TYPE product_type as
(sku int4, category int4, display_name varchar(100),rank int4, category_count);

CREATE OR REPLACE FUNCTION get_toprank_product_category (text) returns setof product_type
as '
DECLARE
   kwd ALIAS for $1;
   mrow RECORD;
   retrow prdtcat_searchresult;
   tempcount int4;
   prevcatnum int4 ;
   i int4;
BEGIN
   tempcount = 0;
   prevcatnum := 0;
   I:=0;
   FOR tbrow IN
      select * from product order by category, rank
   LOOP
      i := i+1;

     IF prevcatnum != mrow.catnum OR i = 1 THEN
           prevcatnum := mrow.catnum;
       if i > 1 THEN

           RETURN NEXT retrow;

       END IF;
       retrow.catnum := mrow.catnum;
       retrow.corenum :=mrow.corenum;
       retrow. mernum := mrow.mernum;
       retrow.mersku := mrow.mersku;

       tempcount = 1;
        retrow.catcount := tempcount;
        prevcatnum := mrow.catnum;
     ELSE
        tempcount := tempcount + 1;
        retrow.catcount := tempcount;
       
      END IF;
   END LOOP;
   RETURN NEXT retrow;
 
   RETURN;
END'
language 'PLPGSQL';