yudiepg@gmail.com (Yudie Pg) writes:
> Hello,
> I have a table, structure like this:
> create table product(
> sku, int4 not null,
> category int4 null,
> display_name varchar(100) null,
> rank int4 null
> )
> let say example data:
> sku, category, display_name
> =======================
> 10001, 5, postgresql, 132
> 10002, 5, mysql, 243
> 10003, 5, oracle, 323
> 10006, 7, photoshop, 53
> 10007, 7, flash mx, 88
> 10008, 9, Windows XP, 44
> 10008, 9, Linux, 74
> Expected query result:
> sku, category, display_name, category_count
> ====================================
> 10001, 5, postgresql, 3
> 10006, 7, photoshop, 2
> 10008, 9, Windows XP, 2
> The idea is getting getting highest ranking each product category and COUNT
> how many products in the category with SINGLE query.
> the first 3 columns can be done with select distinct on (category) ...
> order by category, rank desc but it still missing the category_count. I wish
> no subquery needed for having simplest query plan.
> Thank you.
> Yudie G.
I do not believe you can do this without a subquery - you are trying
to get 2 separate pieces of information from your data
* some data about the record having MAX(rank) for each category
and
* the count of records in each category
Note, however that you can get MAX(rank) and COUNT(category) in one
sequential pass of the data: e.g
SELECT category, MAX(rank), COUNT(category) FROM product;
Joining this with the orignal table is not too dificult :
SELECT sku, category, display_name, category_count
FROM product
JOIN (SELECT category, MAX(rank) AS rank, COUNT(category) AS category_count
FROM product
GROUP BY category) subq
USING(category, rank)
ORDER BY sku;
Depending on what your data looks like, you might improve things by
having an index on category, and perhaps on (category, rank).
Note that there is may be a problem with this query: If you have more
than one product with the same rank in the same category, you may get
more than one record for that category. Apply distinct on as
neccessary.
--
Remove -42 for email