Implementation of GROUPING SETS (T431: Extended grouping capabilities) - Mailing list pgsql-hackers

From Олег Царев
Subject Implementation of GROUPING SETS (T431: Extended grouping capabilities)
Date
Msg-id 54f48e4f0905100557r216dbd20j2ceb1e99d7c39004@mail.gmail.com
Whole thread Raw
Responses Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)
Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)
List pgsql-hackers
Hello all.
Please, approve my ideas for implementation.

Standart has feature T431: Extended grouping capabilities.
This feature i found in TODO-list:
http://wiki.postgresql.org/wiki/Todo -> SQL Commands -> TO DO

MS SQL 2005 partial support this feature:
http://www.kodyaz.com/articles/sql-server-2005-cube-rollup-cannot-compute-distinct-aggregates.aspx
http://blogs.msdn.com/craigfr/archive/2007/09/21/aggregation-with-rollup.aspx

MS SQL 2008 support this feature:
http://blogs.msdn.com/craigfr/archive/2007/10/11/grouping-sets-in-sql-server-2008.aspx

Oracle support this feature:
http://www.compshack.com/sql/oracle-group-rollup

So, it's short notes about GROUPING SETS, but more complete
information have in a official documentation of MS SQL and Oracle
(copyright limited for send as attach).

First. GROUPG SETS.

select A,B,C,SUM(D) from table group by GROUPING SETS( (A,B,C), (A),
() ) - it's example of use grouping sets.
Semantic of this construction - make group by over source more, than
one group of column.
It's very wide key - A,B C. In result set of this example we can find
result set of select   select A,B,C,SUM(D) from table group by A,B,C -
as subset. It's mind: "GROUP BY A,B,C" - subset of "GROUP BY GROUPING
SETS( (A,B,C), (A), () )
Two subset - is GROUP BY A B, and instead C column we look NULL.
Third subset - GROUP BY (), instead A,B,C - NULL, one row - it's name
"GRAND TOTAL". - calculate over all subset without grouping

Also have function "GROUPING"  it's function say about null - "real
null" (from table) or generated by "GROUP BY GROUPING SETS"

My point: this feature can implement over GROUP BY and UNION ALL
We can make decomposition of "GROUP BY GROUPING SETS( (A,B,C),(A),()
)" to  select A,B,C fron table GROUP BY A,B,C .UNION  ALL select
A,B,NULL from table group BY A,B UNION ALL NUll,NUll,NULL from table
group by();

So, it's very simple, don't require modification of executor and
callibrate cost - only parser and semantic anylysis,
'
So, ROLLUP(A1,...An) is alias to "GROUP BY GROUPING SETS(
(A1,...,An),(A1,...,An-1),... (An-1,An),(An),() ),
CUBE - analogue.

If this idea it's good -  i can write code base on old patch
http://archives.postgresql.org/pgsql-hackers/2008-10/msg00838.php or
from clean list (as you wish).

In future i know how to implement ROLLUP more optimal (executor
iterator) and use this ROLLUP for optimisation another GROUP BY,
GROUPING SETS.

Thanks.


pgsql-hackers by date:

Previous
From: Guillaume Smet
Date:
Subject: Re: SQL state in log_line_prefix
Next
From: Pavel Stehule
Date:
Subject: Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)