Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities) - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)
Date
Msg-id 162867790905100621t3d4e2c94qfa7bfb895260981f@mail.gmail.com
Whole thread Raw
In response to Implementation of GROUPING SETS (T431: Extended grouping capabilities)  (Олег Царев <zabivator@gmail.com>)
Responses Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)  (Олег Царев <zabivator@gmail.com>)
List pgsql-hackers
Hello

some other info is on  http://wiki.postgresql.org/wiki/Grouping_Sets

Maybe in e few weak I'll have some a prototype based on CTE. My older
technique based on hashed tables should be well, but it carry lot of
unshared code. Using CTE means so we can optimize CTE and GROUPING
SETS together. I plan to transform query

SELECT * FROM some GROUP BY GROUPING SETS(a, b)

to

WITH q AS (SELECT * FROM some) SELECT * FROM q GROUP BY a UNION ALL SELECT * FROM q GROUP BY b


2009/5/10 Олег Царев <zabivator@gmail.com>:
> Hello all.
> Please, approve my ideas for implementation.
>
> Standart has feature T431: Extended grouping capabilities.
> This feature i found in TODO-list:
> http://wiki.postgresql.org/wiki/Todo -> SQL Commands -> TO DO
>
> MS SQL 2005 partial support this feature:
> http://www.kodyaz.com/articles/sql-server-2005-cube-rollup-cannot-compute-distinct-aggregates.aspx
> http://blogs.msdn.com/craigfr/archive/2007/09/21/aggregation-with-rollup.aspx
>
> MS SQL 2008 support this feature:
> http://blogs.msdn.com/craigfr/archive/2007/10/11/grouping-sets-in-sql-server-2008.aspx
>
> Oracle support this feature:
> http://www.compshack.com/sql/oracle-group-rollup
>
> So, it's short notes about GROUPING SETS, but more complete
> information have in a official documentation of MS SQL and Oracle
> (copyright limited for send as attach).
>
> First. GROUPG SETS.
>
> select A,B,C,SUM(D) from table group by GROUPING SETS( (A,B,C), (A),
> () ) - it's example of use grouping sets.
> Semantic of this construction - make group by over source more, than
> one group of column.
> It's very wide key - A,B C. In result set of this example we can find
> result set of select   select A,B,C,SUM(D) from table group by A,B,C -
> as subset. It's mind: "GROUP BY A,B,C" - subset of "GROUP BY GROUPING
> SETS( (A,B,C), (A), () )
> Two subset - is GROUP BY A B, and instead C column we look NULL.
> Third subset - GROUP BY (), instead A,B,C - NULL, one row - it's name
> "GRAND TOTAL". - calculate over all subset without grouping
>
> Also have function "GROUPING"  it's function say about null - "real
> null" (from table) or generated by "GROUP BY GROUPING SETS"
>
> My point: this feature can implement over GROUP BY and UNION ALL
> We can make decomposition of "GROUP BY GROUPING SETS( (A,B,C),(A),()
> )" to  select A,B,C fron table GROUP BY A,B,C .UNION  ALL select
> A,B,NULL from table group BY A,B UNION ALL NUll,NUll,NULL from table
> group by();
>
> So, it's very simple, don't require modification of executor and
> callibrate cost - only parser and semantic anylysis,
> '
> So, ROLLUP(A1,...An) is alias to "GROUP BY GROUPING SETS(
> (A1,...,An),(A1,...,An-1),... (An-1,An),(An),() ),
> CUBE - analogue.
>
> If this idea it's good -  i can write code base on old patch
> http://archives.postgresql.org/pgsql-hackers/2008-10/msg00838.php or
> from clean list (as you wish).

This is suboptimal. When SELECT * FROM X GROUP BY GROUPING SETS ...,
where X is some joined data, then you repeat JOIN on every grouping
set. So your solution is simple for implementation, but it should be
really slow.

Regards
Pavel Stehule

>
> In future i know how to implement ROLLUP more optimal (executor
> iterator) and use this ROLLUP for optimisation another GROUP BY,
> GROUPING SETS.
>
> Thanks.
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


pgsql-hackers by date:

Previous
From: Олег Царев
Date:
Subject: Implementation of GROUPING SETS (T431: Extended grouping capabilities)
Next
From: Andrew Dunstan
Date:
Subject: Re: SQL state in log_line_prefix