Thread: hpw to Count without group by
Hello,
I have a table, structure like this:
create table product(
sku, int4 not null,
category int4 null,
display_name varchar(100) null,
rank int4 null
)
let say example data:
sku, category, display_name
=======================
10001, 5, postgresql, 132
10002, 5, mysql, 243
10003, 5, oracle, 323
10006, 7, photoshop, 53
10007, 7, flash mx, 88
10008, 9, Windows XP, 44
10008, 9, Linux, 74
Expected query result:
sku, category, display_name, category_count
====================================
10001, 5, postgresql, 3
10006, 7, photoshop, 2
10008, 9, Windows XP, 2
The idea is getting getting highest ranking each product category and COUNT how many products in the category with SINGLE query.
the first 3 columns can be done with select distinct on (category) ... order by category, rank desc but it still missing the category_count. I wish no subquery needed for having simplest query plan.
Thank you.
Yudie G.
On Wed, 2005-06-01 at 16:16 -0500, Yudie Pg wrote: > Hello, > I have a table, structure like this: [...] > Expected query result: > > sku, category, display_name, category_count > ==================================== > 10001, 5, postgresql, 3 > 10006, 7, photoshop, 2 > 10008, 9, Windows XP, 2 > > The idea is getting getting highest ranking each product category and > COUNT how many products in the category with SINGLE query. > > the first 3 columns can be done with select distinct on (category) ... > order by category, rank desc but it still missing the category_count. > I wish no subquery needed for having simplest query plan. how about a simple join ? select sku,category,display_name,count from (select distinct on (category) category, sku,display_name from product order by category,rank ) as foo natural join (select category,count(*) as count from product group by category ) as bar; gnari
yudiepg@gmail.com (Yudie Pg) writes: > Hello, > I have a table, structure like this: > create table product( > sku, int4 not null, > category int4 null, > display_name varchar(100) null, > rank int4 null > ) > let say example data: > sku, category, display_name > ======================= > 10001, 5, postgresql, 132 > 10002, 5, mysql, 243 > 10003, 5, oracle, 323 > 10006, 7, photoshop, 53 > 10007, 7, flash mx, 88 > 10008, 9, Windows XP, 44 > 10008, 9, Linux, 74 > Expected query result: > sku, category, display_name, category_count > ==================================== > 10001, 5, postgresql, 3 > 10006, 7, photoshop, 2 > 10008, 9, Windows XP, 2 > The idea is getting getting highest ranking each product category and COUNT > how many products in the category with SINGLE query. > the first 3 columns can be done with select distinct on (category) ... > order by category, rank desc but it still missing the category_count. I wish > no subquery needed for having simplest query plan. > Thank you. > Yudie G. I do not believe you can do this without a subquery - you are trying to get 2 separate pieces of information from your data * some data about the record having MAX(rank) for each category and * the count of records in each category Note, however that you can get MAX(rank) and COUNT(category) in one sequential pass of the data: e.g SELECT category, MAX(rank), COUNT(category) FROM product; Joining this with the orignal table is not too dificult : SELECT sku, category, display_name, category_count FROM product JOIN (SELECT category, MAX(rank) AS rank, COUNT(category) AS category_count FROM product GROUP BY category) subq USING(category, rank) ORDER BY sku; Depending on what your data looks like, you might improve things by having an index on category, and perhaps on (category, rank). Note that there is may be a problem with this query: If you have more than one product with the same rank in the same category, you may get more than one record for that category. Apply distinct on as neccessary. -- Remove -42 for email
I do not believe you can do this without a subquery - you are trying
to get 2 separate pieces of information from your data
* some data about the record having MAX(rank) for each category
and
* the count of records in each category
Hi, I guess i try to answer my own question which end up with creating stored procedure.
Unless you have direct query idea.
This function cut the half of query time, as my concern about postgres count agregate function is always slower than I expected.
SQL:
CREATE TYPE product_type as
(sku int4, category int4, display_name varchar(100),rank int4, category_count);
CREATE OR REPLACE FUNCTION get_toprank_product_category (text) returns setof product_type
as '
DECLARE
kwd ALIAS for $1;
mrow RECORD;
retrow prdtcat_searchresult;
tempcount int4;
prevcatnum int4 ;
i int4;
BEGIN
tempcount = 0;
prevcatnum := 0;
I:=0;
FOR tbrow IN
select * from product order by category, rank
as '
DECLARE
kwd ALIAS for $1;
mrow RECORD;
retrow prdtcat_searchresult;
tempcount int4;
prevcatnum int4 ;
i int4;
BEGIN
tempcount = 0;
prevcatnum := 0;
I:=0;
FOR tbrow IN
select * from product order by category, rank
LOOP
i := i+1;
i := i+1;
IF prevcatnum != mrow.catnum OR i = 1 THEN
prevcatnum := mrow.catnum;
if i > 1 THEN
RETURN NEXT retrow;
END IF;
retrow.catnum := mrow.catnum;
retrow.corenum :=mrow.corenum;
retrow. mernum := mrow.mernum;
retrow.mersku := mrow.mersku;
tempcount = 1;
retrow.catcount := tempcount;
prevcatnum := mrow.catnum;
ELSE
tempcount := tempcount + 1;
retrow.catcount := tempcount;
END IF;
END LOOP;
RETURN NEXT retrow;
RETURN;
END'
language 'PLPGSQL';