Thread: how to use aggregate functions in this case
Hi, Thats my code snipped: SELECT v_rec1.user, sum(CASE WHEN v_rec_fts.lev BETWEEN 0 AND 25 THEN 1 ELSE 0 END) as "0 to 25", sum(CASE WHEN v_rec_fts.lev BETWEEN 25 AND 50 THEN 1 ELSE 0 END) as "25 to 50", sum(CASE WHEN v_rec_fts.lev BETWEEN 50 AND 100 THEN 1 ELSE 0 END) as "50 to 100" INTO v_rec2 GROUP BY user; Now I want to summuarize the "0 to 25" values and the others in the same query. Somehow like this: count("0 to 25") But I want to do it with every single user and I don't know how to do that A result should look like this: user percentage count smith "0 to 25" 5 smith "25 to 50" 7 smith "50 to 75" 2 jones "0 to 25" 11 jones "25 to 50" 1 jones "50 to 75" 3 Hope someone who can help me Janek Sendrowski
Janek Sendrowski wrote > Hi, > > Thats my code snipped: > > SELECT v_rec1.user, > sum(CASE WHEN v_rec_fts.lev BETWEEN 0 AND 25 THEN 1 ELSE 0 END) as > "0 to 25", > sum(CASE WHEN v_rec_fts.lev BETWEEN 25 AND 50 THEN 1 ELSE 0 END) > as "25 to 50", > sum(CASE WHEN v_rec_fts.lev BETWEEN 50 AND 100 THEN 1 ELSE 0 END) > as "50 to 100" > INTO v_rec2 > GROUP BY user; > > Now I want to summuarize the "0 to 25" values and the others in the same > query. > Somehow like this: count("0 to 25") > But I want to do it with every single user and I don't know how to do that > > A result should look like this: > > user percentage count > smith "0 to 25" 5 > smith "25 to 50" 7 > smith "50 to 75" 2 > jones "0 to 25" 11 > jones "25 to 50" 1 > jones "50 to 75" 3 > > Hope someone who can help me > > Janek Sendrowski > > > -- > Sent via pgsql-general mailing list ( > pgsql-general@ > ) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general You have to write three queries (select statements) and then "UNION ALL" them together. Each sub-query has a where clause matching the range you wish to aggregate for. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/how-to-use-aggregate-functions-in-this-case-tp5768522p5768523.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
Sorry, I formulated it wrong. My problem is, that I want to count the ranges for every user, but if I use count(range), it counts the ranges of allusers. Janek Sendrowski
Janek Sendrowski wrote > Sorry, I formulated it wrong. > > My problem is, that I want to count the ranges for every user, but if I > use count(range), it counts the ranges of all users. Assuming your example output is indeed what you desire: SELECT user, '0 to 25'::varchar AS percentage, count(*) FROM tbl WHERE val BETWEEN 0 AND 25 GROUP BY user UNION ALL SELECT user, '25 to 50'::varchar, count(*) FROM tbl WHERE val BETWEEN 25 AND 50 GROUP BY user ...and so forth Note that BETWEEN has inclusive end-points so anything with exactly 25 (for example) is being double-counted. Your other option is something like: SELECT user, percentage_range_category, count(*) FROM ( SELECT recordid, user, CASE .... END::varchar AS percentage_range_category ) categorize GROUP BY user, percentage_range_category; In short for every record you simply categorize the value then add that category to your group-by. Both are equally valid and the second one is probably easier to comprehend; the first option just happened to occur to me first. I have no idea which one would perform better in theory nor specifically with your data. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/how-to-use-aggregate-functions-in-this-case-tp5768522p5768525.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
On Sun, Aug 25, 2013 at 5:59 PM, Janek Sendrowski <janek12@web.de> wrote:
SELECT v_rec1.user,
sum(CASE WHEN v_rec_fts.lev BETWEEN 0 AND 25 THEN 1 ELSE 0 END) as "0 to 25",
sum(CASE WHEN v_rec_fts.lev BETWEEN 25 AND 50 THEN 1 ELSE 0 END) as "25 to 50",
sum(CASE WHEN v_rec_fts.lev BETWEEN 50 AND 100 THEN 1 ELSE 0 END) as "50 to 100"
INTO v_rec2
GROUP BY user;
Now I want to summuarize the "0 to 25" values and the others in the same query.
Somehow like this: count("0 to 25")
But I want to do it with every single user and I don't know how to do that
A result should look like this:
user percentage count
smith "0 to 25" 5
smith "25 to 50" 7
smith "50 to 75" 2
jones "0 to 25" 11
jones "25 to 50" 1
jones "50 to 75" 3
This appears to be some kind of equal interval problem.
SELECT v_rec1.user, COUNT(*) as count,
FROM v_rec2
GROUP BY user, bucket;
FROM v_rec2
GROUP BY user, bucket;
(Untested, but this should be the gist.)
Bucket 1 would be 0 to 25, bucket 2 is 25 to 50, 3 is 50 to 75, 4 is 75 to 100. If you really need to change the bucket number to some kind of text, you can probably nest this query inside another that uses a CASE to pick the text based on on the bucket number.
Good luck.On Sun, Aug 25, 2013 at 8:36 PM, BladeOfLight16 <bladeoflight16@gmail.com> wrote:
WIDTH_BUCKET(v_rec_fts.lev, 0, 100, 4) AS bucketThis appears to be some kind of equal interval problem.SELECT v_rec1.user,COUNT(*) as count,
FROM v_rec2
GROUP BY user, bucket;(Untested, but this should be the gist.)Bucket 1 would be 0 to 25, bucket 2 is 25 to 50, 3 is 50 to 75, 4 is 75 to 100. If you really need to change the bucket number to some kind of text, you can probably nest this query inside another that uses a CASE to pick the text based on on the bucket number.Good luck.
Then again, I guess you don't need a nested query.
SELECT v_rec1.user,
CASE WIDTH_BUCKET(v_rec_fts.lev, 0, 100, 4)
SELECT v_rec1.user,
CASE WIDTH_BUCKET(v_rec_fts.lev, 0, 100, 4)
WHEN 1 THEN '0 to 25'
WHEN 2 THEN '25 to 50'
WHEN 3 THEN '50 to 75'
WHEN 4 THEN '75 to 100'
ELSE 'But how?'
END CASE AS quarter_percentage
END CASE AS quarter_percentage
COUNT(*) as count,
FROM v_rec2
GROUP BY user, quarter_percentage;
FROM v_rec2
GROUP BY user, quarter_percentage;
BladeOfLight16 wrote > Then again, I guess you don't need a nested query. > > SELECT v_rec1.user, > CASE WIDTH_BUCKET(v_rec_fts.lev, 0, 100, 4) > WHEN 1 THEN '0 to 25' > WHEN 2 THEN '25 to 50' > WHEN 3 THEN '50 to 75' > WHEN 4 THEN '75 to 100' > ELSE 'But how?' > END CASE AS quarter_percentage > COUNT(*) as count, > FROM v_rec2 > GROUP BY user, quarter_percentage; This is clean but requires the use of equal intervals. Another option, though I am unfamiliar with the exact syntax, is to use the contains operator and an "intrange" (range type, for integer or whatever actual type is needed). SELECT CASE WHEN val @> '(0, 25]' -- '(25,50]', etc... This allows for uneven bucket sizes and avoid the double-inclusive endpoints problem that results from using BETWEEN. Requires 9.2 David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/how-to-use-aggregate-functions-in-this-case-tp5768522p5768573.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
Hi, thanks for all your answers. I'll have a try with the contains operator and the intrange, but before I'd like to know if this would work: CASE WHEN a >= 0 AND a < 25 CASE WHEN a >= 25 AND a < 50 There wouldn't be a double endpoint. I just have to decide which range the endpoint includes. Janek Sendrowski
Janek Sendrowski wrote > Hi, > > > > thanks for all your answers. > > I'll have a try with the contains operator and the intrange, but > before I'd like to know if this would work: > > > > CASE WHEN a >= 0 AND a < 25 > > CASE WHEN a >= 25 AND a < 50 > > > > There wouldn't be a double endpoint. I just have to decide which range > the endpoint includes. > > > > Janek Sendrowski Yes, using explicit comparison operators with "AND" will work just fine; its just a little more verbose so the other options, if available, are preferred from a readability standpoint. I do not know whether the different options may have different performance characteristics. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/how-to-use-aggregate-functions-in-this-case-tp5768522p5768636.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
David Johnston wrote > > Janek Sendrowski wrote >> Hi, >> >> >> >> thanks for all your answers. >> >> I'll have a try with the contains operator and the intrange, but >> before I'd like to know if this would work: >> >> >> >> CASE WHEN a >= 0 AND a < 25 >> >> CASE WHEN a >= 25 AND a < 50 >> >> >> >> There wouldn't be a double endpoint. I just have to decide which >> range the endpoint includes. >> >> >> >> Janek Sendrowski > Yes, using explicit comparison operators with "AND" will work just fine; > its just a little more verbose so the other options, if available, are > preferred from a readability standpoint. I do not know whether the > different options may have different performance characteristics. > > David J. And just for completeness: WITH range_def (low, high, label) AS ( VALUES (0, 25, '0 to 25'::varchar), (25, 50, '25 to 50') ) SELECT ... FROM source_data JOIN range_def ON (val >= low AND val < high) [the rest of the query] David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/how-to-use-aggregate-functions-in-this-case-tp5768522p5768721.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.