Thread: Grouping, Aggregate, Min, Max

Grouping, Aggregate, Min, Max

From
Misa Simic
Date:
Hi All,

I am not sure how to define with words what I want to accomplish (so can't ask google the right question :) )

So will try to explain with sample data and expected result:

Scenario 1)

idthing_idcategoryperiod_id
11A1
21A2
31A3
41A4
51A5
61A6
71A7
81A8
91A9
102A1
112A2
122A3
132A4



Expected result:

thing_idcategoryperiods
1A1-9
2A1-4

(Sounds easy, group by, thing_id, category use Min and Max for period id - but further scenarios makes it a bit complicated...)

Scenario 2)

idthing_idcategoryperiod_id
11A1
21A2
31A3
41A4
51B5
61B6
71B7
81A8
91A9
102A1
112A2
122A3
132A4

Expected result:
thing_idcategoryperiods
1A1-4, 8-9
1B5-7
2A1-4

Scenario 3)

idthing_idcategoryperiod_id
11A1
21A2
31A3
41A7
51A8
61A9
72A1
82A2
92A3
102A4

Expected result:

thing_idcategoryperiods
1A1-3, 7-9
2A1-4



So goal is, to group by thing_id, category id - but if period_id is interupted (not in incremented by 1) to have aggregated spans...

To desired results we have came up using several CTE's (what makes a query a bit big, and more "procedural way": make cte what calculated diff between current and previous row, next cte uses previous one to define groupings, next cte to make aggregates etc...)

So I wonder - is there some kind of aggregate window function what does desired results?


Many Thanks,

Misa

Re: Grouping, Aggregate, Min, Max

From
Rémi Cura
Date:
There is a trick to simplify the thing and avoid using aggregates :

Cheers,
Rémi-C


2013/12/13 Misa Simic <misa.simic@gmail.com>
Hi All,

I am not sure how to define with words what I want to accomplish (so can't ask google the right question :) )

So will try to explain with sample data and expected result:

Scenario 1)

idthing_idcategoryperiod_id
11A1
21A2
31A3
41A4
51A5
61A6
71A7
81A8
91A9
102A1
112A2
122A3
132A4



Expected result:

thing_idcategoryperiods
1A1-9
2A1-4

(Sounds easy, group by, thing_id, category use Min and Max for period id - but further scenarios makes it a bit complicated...)

Scenario 2)

idthing_idcategoryperiod_id
11A1
21A2
31A3
41A4
51B5
61B6
71B7
81A8
91A9
102A1
112A2
122A3
132A4

Expected result:
thing_idcategoryperiods
1A1-4, 8-9
1B5-7
2A1-4

Scenario 3)

idthing_idcategoryperiod_id
11A1
21A2
31A3
41A7
51A8
61A9
72A1
82A2
92A3
102A4

Expected result:

thing_idcategoryperiods
1A1-3, 7-9
2A1-4



So goal is, to group by thing_id, category id - but if period_id is interupted (not in incremented by 1) to have aggregated spans...

To desired results we have came up using several CTE's (what makes a query a bit big, and more "procedural way": make cte what calculated diff between current and previous row, next cte uses previous one to define groupings, next cte to make aggregates etc...)

So I wonder - is there some kind of aggregate window function what does desired results?


Many Thanks,

Misa


Re: Grouping, Aggregate, Min, Max

From
Misa Simic
Date:

Thanks Rémi-C,

Well, not sure is it a goal to avoid aggregates... 

Bellow problem/solution even works (not sure) I guess would produce (if we imagine instead of count it use min and max in a row, though this case a bit complicated because of it should take real values from the source table, because of simplicity it starts in examples from 1 - what is not always the case...)

1A1-4
1B5-7
1A8-9

so next step, is to aggregate1-4,8-9 in one row

I think we have used similar approach, what with a few CTE's provides desired result, just think would be simpler with 1 windowed aggregate function.... i.e.

for:
c1, c2
A 1
A 2
A 3

SELECT DISTINCT c1, custom_agg_function(c2) OVER (PARTITION BY c1 ORDER BY c2) 


result
a, 1 -3
 
in case
c1, c2
A 1
A 2
A 3
A 5

result:

A , 1-3, 5-5

thanks,

Misa

2013/12/13 Rémi Cura <remi.cura@gmail.com>
There is a trick to simplify the thing and avoid using aggregates :

Cheers,
Rémi-C


2013/12/13 Misa Simic <misa.simic@gmail.com>
Hi All,

I am not sure how to define with words what I want to accomplish (so can't ask google the right question :) )

So will try to explain with sample data and expected result:

Scenario 1)

idthing_idcategoryperiod_id
11A1
21A2
31A3
41A4
51A5
61A6
71A7
81A8
91A9
102A1
112A2
122A3
132A4



Expected result:

thing_idcategoryperiods
1A1-9
2A1-4

(Sounds easy, group by, thing_id, category use Min and Max for period id - but further scenarios makes it a bit complicated...)

Scenario 2)

idthing_idcategoryperiod_id
11A1
21A2
31A3
41A4
51B5
61B6
71B7
81A8
91A9
102A1
112A2
122A3
132A4

Expected result:
thing_idcategoryperiods
1A1-4, 8-9
1B5-7
2A1-4

Scenario 3)

idthing_idcategoryperiod_id
11A1
21A2
31A3
41A7
51A8
61A9
72A1
82A2
92A3
102A4

Expected result:

thing_idcategoryperiods
1A1-3, 7-9
2A1-4



So goal is, to group by thing_id, category id - but if period_id is interupted (not in incremented by 1) to have aggregated spans...

To desired results we have came up using several CTE's (what makes a query a bit big, and more "procedural way": make cte what calculated diff between current and previous row, next cte uses previous one to define groupings, next cte to make aggregates etc...)

So I wonder - is there some kind of aggregate window function what does desired results?


Many Thanks,

Misa



Re: Grouping, Aggregate, Min, Max

From
David Johnston
Date:
Re:custom aggregate:

I'd probably try building a two dimensional array in the state transition
function.  Take the new value and check if it is adjacent to the last value
in the last bin of the current state. If so add it to that bin.  If not
create a new bin and store it there.  Requires sorted input.

You could also just store all the values encountered and at the end group
them into bins after sorting internally.  That way you just need to get the
partition right - not the order by and sub-groups.

The final output is just a call to string_agg though you would have to
unnest the array in a custom manner since unnest() flattens
multiple-dimensional arrays.  See a recent thread for specifics.

David J.




--
View this message in context: http://postgresql.1045698.n5.nabble.com/Grouping-Aggregate-Min-Max-tp5783279p5783318.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: Grouping, Aggregate, Min, Max

From
Kevin Grittner
Date:
Misa Simic <misa.simic@gmail.com> wrote:

> So I wonder - is there some kind of aggregate window function
> what does desired results?

Not built in, but PostgreSQL makes it pretty easy to do so.  With a
little effort to define your own aggregate function, your query can
look like this:

SELECT
    thing_id,
    category,
    int4range_list(period_id)
  FROM thing
  GROUP BY thing_id, category
  ORDER BY 1, 2;

I've attached a couple files -- one which creates the desired
aggregate function, and the other loads a table with two of your
sample data sets and runs the above.  This is just intended as a
quick example of the capabilities available to you.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

Re: Grouping, Aggregate, Min, Max

From
Misa Simic
Date:
2013/12/13 Kevin Grittner <kgrittn@ymail.com>
Misa Simic <misa.simic@gmail.com> wrote:

> So I wonder - is there some kind of aggregate window function
> what does desired results?

Not built in, but PostgreSQL makes it pretty easy to do so.  With a
little effort to define your own aggregate function, your query can
look like this:

SELECT
    thing_id,
    category,
    int4range_list(period_id)
  FROM thing
  GROUP BY thing_id, category
  ORDER BY 1, 2;

I've attached a couple files -- one which creates the desired
aggregate function, and the other loads a table with two of your
sample data sets and runs the above.  This is just intended as a
quick example of the capabilities available to you.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Many thanks Kevin! :)

It simply works - perfect! :)

Many thanks,

Misa