Thread: Grouping, Aggregate, Min, Max
Hi All,
I am not sure how to define with words what I want to accomplish (so can't ask google the right question :) )
So will try to explain with sample data and expected result:
Scenario 1)
id | thing_id | category | period_id |
1 | 1 | A | 1 |
2 | 1 | A | 2 |
3 | 1 | A | 3 |
4 | 1 | A | 4 |
5 | 1 | A | 5 |
6 | 1 | A | 6 |
7 | 1 | A | 7 |
8 | 1 | A | 8 |
9 | 1 | A | 9 |
10 | 2 | A | 1 |
11 | 2 | A | 2 |
12 | 2 | A | 3 |
13 | 2 | A | 4 |
Expected result:
thing_id | category | periods |
1 | A | 1-9 |
2 | A | 1-4 |
(Sounds easy, group by, thing_id, category use Min and Max for period id - but further scenarios makes it a bit complicated...)
Scenario 2)
id | thing_id | category | period_id |
1 | 1 | A | 1 |
2 | 1 | A | 2 |
3 | 1 | A | 3 |
4 | 1 | A | 4 |
5 | 1 | B | 5 |
6 | 1 | B | 6 |
7 | 1 | B | 7 |
8 | 1 | A | 8 |
9 | 1 | A | 9 |
10 | 2 | A | 1 |
11 | 2 | A | 2 |
12 | 2 | A | 3 |
13 | 2 | A | 4 |
Expected result:
thing_id | category | periods |
1 | A | 1-4, 8-9 |
1 | B | 5-7 |
2 | A | 1-4 |
Scenario 3)
id | thing_id | category | period_id |
1 | 1 | A | 1 |
2 | 1 | A | 2 |
3 | 1 | A | 3 |
4 | 1 | A | 7 |
5 | 1 | A | 8 |
6 | 1 | A | 9 |
7 | 2 | A | 1 |
8 | 2 | A | 2 |
9 | 2 | A | 3 |
10 | 2 | A | 4 |
Expected result:
thing_id | category | periods |
1 | A | 1-3, 7-9 |
2 | A | 1-4 |
So goal is, to group by thing_id, category id - but if period_id is interupted (not in incremented by 1) to have aggregated spans...
To desired results we have came up using several CTE's (what makes a query a bit big, and more "procedural way": make cte what calculated diff between current and previous row, next cte uses previous one to define groupings, next cte to make aggregates etc...)
So I wonder - is there some kind of aggregate window function what does desired results?
Many Thanks,
Misa
There is a trick to simplify the thing and avoid using aggregates :
I think it will give you your answer.
Cheers,
Rémi-C
2013/12/13 Misa Simic <misa.simic@gmail.com>
Hi All,I am not sure how to define with words what I want to accomplish (so can't ask google the right question :) )So will try to explain with sample data and expected result:Scenario 1)
id thing_id category period_id 1 1 A 1 2 1 A 2 3 1 A 3 4 1 A 4 5 1 A 5 6 1 A 6 7 1 A 7 8 1 A 8 9 1 A 9 10 2 A 1 11 2 A 2 12 2 A 3 13 2 A 4 Expected result:
thing_id category periods 1 A 1-9 2 A 1-4 (Sounds easy, group by, thing_id, category use Min and Max for period id - but further scenarios makes it a bit complicated...)Scenario 2)
id thing_id category period_id 1 1 A 1 2 1 A 2 3 1 A 3 4 1 A 4 5 1 B 5 6 1 B 6 7 1 B 7 8 1 A 8 9 1 A 9 10 2 A 1 11 2 A 2 12 2 A 3 13 2 A 4 Expected result:
thing_id category periods 1 A 1-4, 8-9 1 B 5-7 2 A 1-4 Scenario 3)
id thing_id category period_id 1 1 A 1 2 1 A 2 3 1 A 3 4 1 A 7 5 1 A 8 6 1 A 9 7 2 A 1 8 2 A 2 9 2 A 3 10 2 A 4 Expected result:
thing_id category periods 1 A 1-3, 7-9 2 A 1-4 So goal is, to group by thing_id, category id - but if period_id is interupted (not in incremented by 1) to have aggregated spans...To desired results we have came up using several CTE's (what makes a query a bit big, and more "procedural way": make cte what calculated diff between current and previous row, next cte uses previous one to define groupings, next cte to make aggregates etc...)So I wonder - is there some kind of aggregate window function what does desired results?Many Thanks,Misa
Thanks Rémi-C,
Well, not sure is it a goal to avoid aggregates...
Bellow problem/solution even works (not sure) I guess would produce (if we imagine instead of count it use min and max in a row, though this case a bit complicated because of it should take real values from the source table, because of simplicity it starts in examples from 1 - what is not always the case...)
1 | A | 1-4 |
1 | B | 5-7 |
1 | A | 8-9 |
so next step, is to aggregate1-4,8-9 in one row
I think we have used similar approach, what with a few CTE's provides desired result, just think would be simpler with 1 windowed aggregate function.... i.e.
for:
c1, c2
A 1
A 2
A 3
SELECT DISTINCT c1, custom_agg_function(c2) OVER (PARTITION BY c1 ORDER BY c2)
result
a, 1 -3
in case
c1, c2
A 1
A 2
A 3
A 5
result:
A , 1-3, 5-5
thanks,
Misa
2013/12/13 Rémi Cura <remi.cura@gmail.com>
There is a trick to simplify the thing and avoid using aggregates :I think it will give you your answer.Cheers,Rémi-C2013/12/13 Misa Simic <misa.simic@gmail.com>Hi All,I am not sure how to define with words what I want to accomplish (so can't ask google the right question :) )So will try to explain with sample data and expected result:Scenario 1)
id thing_id category period_id 1 1 A 1 2 1 A 2 3 1 A 3 4 1 A 4 5 1 A 5 6 1 A 6 7 1 A 7 8 1 A 8 9 1 A 9 10 2 A 1 11 2 A 2 12 2 A 3 13 2 A 4 Expected result:
thing_id category periods 1 A 1-9 2 A 1-4 (Sounds easy, group by, thing_id, category use Min and Max for period id - but further scenarios makes it a bit complicated...)Scenario 2)
id thing_id category period_id 1 1 A 1 2 1 A 2 3 1 A 3 4 1 A 4 5 1 B 5 6 1 B 6 7 1 B 7 8 1 A 8 9 1 A 9 10 2 A 1 11 2 A 2 12 2 A 3 13 2 A 4 Expected result:
thing_id category periods 1 A 1-4, 8-9 1 B 5-7 2 A 1-4 Scenario 3)
id thing_id category period_id 1 1 A 1 2 1 A 2 3 1 A 3 4 1 A 7 5 1 A 8 6 1 A 9 7 2 A 1 8 2 A 2 9 2 A 3 10 2 A 4 Expected result:
thing_id category periods 1 A 1-3, 7-9 2 A 1-4 So goal is, to group by thing_id, category id - but if period_id is interupted (not in incremented by 1) to have aggregated spans...To desired results we have came up using several CTE's (what makes a query a bit big, and more "procedural way": make cte what calculated diff between current and previous row, next cte uses previous one to define groupings, next cte to make aggregates etc...)So I wonder - is there some kind of aggregate window function what does desired results?Many Thanks,Misa
Re:custom aggregate: I'd probably try building a two dimensional array in the state transition function. Take the new value and check if it is adjacent to the last value in the last bin of the current state. If so add it to that bin. If not create a new bin and store it there. Requires sorted input. You could also just store all the values encountered and at the end group them into bins after sorting internally. That way you just need to get the partition right - not the order by and sub-groups. The final output is just a call to string_agg though you would have to unnest the array in a custom manner since unnest() flattens multiple-dimensional arrays. See a recent thread for specifics. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Grouping-Aggregate-Min-Max-tp5783279p5783318.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
Misa Simic <misa.simic@gmail.com> wrote: > So I wonder - is there some kind of aggregate window function > what does desired results? Not built in, but PostgreSQL makes it pretty easy to do so. With a little effort to define your own aggregate function, your query can look like this: SELECT thing_id, category, int4range_list(period_id) FROM thing GROUP BY thing_id, category ORDER BY 1, 2; I've attached a couple files -- one which creates the desired aggregate function, and the other loads a table with two of your sample data sets and runs the above. This is just intended as a quick example of the capabilities available to you. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
2013/12/13 Kevin Grittner <kgrittn@ymail.com>
Misa Simic <misa.simic@gmail.com> wrote:Not built in, but PostgreSQL makes it pretty easy to do so. With a
> So I wonder - is there some kind of aggregate window function
> what does desired results?
little effort to define your own aggregate function, your query can
look like this:
SELECT
thing_id,
category,
int4range_list(period_id)
FROM thing
GROUP BY thing_id, category
ORDER BY 1, 2;
I've attached a couple files -- one which creates the desired
aggregate function, and the other loads a table with two of your
sample data sets and runs the above. This is just intended as a
quick example of the capabilities available to you.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Many thanks Kevin! :)
It simply works - perfect! :)
Many thanks,
Misa