Re: How pull - Mailing list pgsql-sql
From | Filip Rembiałkowski |
---|---|
Subject | Re: How pull |
Date | |
Msg-id | 92869e660709231815pe28db6cp6d1f132f16f2c181@mail.gmail.com Whole thread Raw |
In response to | How pull ("Matt Magoffin" <postgresql.org@msqr.us>) |
Responses |
Re: How pull
|
List | pgsql-sql |
On 23/09/2007, Matt Magoffin <postgresql.org@msqr.us> wrote: > Hello, > > My SQL skills are limited and I'm struggling with a query where I want to > return a single item of an aggregate join. The query looks like this: > > select > (case > when agg.avg_rating is null then 0.0 > when agg.avg_rating < 0.75 then 0.5 > when agg.avg_rating < 1.25 then 1.0 > when agg.avg_rating < 1.75 then 1.5 > when agg.avg_rating < 2.25 then 2.0 > when agg.avg_rating < 2.75 then 2.5 > when agg.avg_rating < 3.25 then 3.0 > when agg.avg_rating < 3.75 then 3.5 > when agg.avg_rating < 4.25 then 4.0 > when agg.avg_rating < 4.75 then 4.5 > else 5.0 > end) as avg_rating, > count(item.itemid) as item_count > from media_item item > inner join ( > select rating.mediaitem_userrating_hjid as ritemid, > avg(rating.rating) as avg_rating > from media_item_rating rating, media_item item > where rating.mediaitem_userrating_hjid = item.itemid > group by rating.mediaitem_userrating_hjid > ) as agg > on item.itemid = agg.ritemid > group by avg_rating > order by avg_rating desc > > and a sample of results is this: > > avg_rating | item_count > ------------+------------ > 5.0 | 21 > 4.0 | 33 > 3.0 | 13 > 2.0 | 4 > 1.0 | 1 > > What I want as well is the ID of the item (and possibly it's avg_rating > value) from the "agg" join with the highest avg_rating for each output > row... something like this > > avg_rating | item_count | item_id | item_rating > ------------+----------------------------------- > 5.0 | 21 | 109890 | 4.9 > 4.0 | 33 | 89201 | 4.1 > 3.0 | 13 | 119029 | 2.8 > 2.0 | 4 | 182999 | 2.2 > 1.0 | 1 | 1929 | 1.0 > > So the intention in this example is that item #109890 has an average > rating of 4.9 and that is the highest rating within the > 4.75 rating > group. > > If anyone had any tips I'd greatly appreciate it. > create ranking function to make queries look simpler: create or replace function ranking_group(numeric) returns numeric as $$ select case when $1 < 0.3456 then 'quite small' ... end $$ language sql immutable; (I'd make it STRICT, but you allow null rankings) 1st way: DISTINCT ON + subquery select *, (select count(*) from rating where rating_group(rating) = subq.rating_group ) as rating_group_size from ( select distinct on (rating_group) rating_group(r.rating), r.item_id as best_rated_item_id, r.rating as best_rating from rating r order by rating_group desc, r.rating desc ) subq; 2nd way (faster - actually 2 x faster) using FIRST aggregate to calculate all in one pass create function first(numeric,numeric) returns numeric as 'select $1' language sql immutable strict; create function first(integer,integer) returns integer as 'select $1' language sql immutable strict; create aggregate first( integer ) ( SFUNC = first, STYPE = integer ); create aggregate first ( numeric ) ( SFUNC = first, STYPE = numeric ); select rating_group(rating), count(*) as num_ratings, first(item_id) as best_rated_item_id, first(rating) as best_rating from ( select * from rating order by rating desc ) ordered_ratings group by rating_group order by rating_group desc; note: if you can, get rid of null ratings. what are they supposed to mean? they make things a bit more complicated. -- Filip Rembiałkowski