Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities) - Mailing list pgsql-hackers
From | Олег Царев |
---|---|
Subject | Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities) |
Date | |
Msg-id | 54f48e4f0905121513n840a44dredb8ccf56317ab7@mail.gmail.com Whole thread Raw |
In response to | Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities) (Pavel Stehule <pavel.stehule@gmail.com>) |
Responses |
Re: Implementation of GROUPING SETS (T431: Extended
grouping capabilities)
(Alvaro Herrera <alvherre@commandprompt.com>)
|
List | pgsql-hackers |
Hello all! If no one objecte (all agree, in other say) i continue work on patch - particulary, i want support second strategy (tuple store instead of hash-table) for save order of source (more cheap solution in case with grouping sets + order by), investigate and brainstorm another optimisation, writing regression tests and technical documentation. But I need some time for complete my investigation internals of PostgreSQL, particulary CTE. Thanks. 2009/5/13 Pavel Stehule <pavel.stehule@gmail.com>: > Hello Oleg > > I am sending a new CTE based variant of my GROUPING SETS patch, > > this patch has some bugs but it is good prototype (it's more stable > than old patch): > > postgres=# select selling_date, baguette, sum(items) from > baguette_selling group by grouping sets(1,2); > selling_date | baguette | sum > --------------+----------+----- > 2007-10-30 | | 17 > 2007-10-31 | | 12 > | golf | 9 > | buster | 20 > (4 rows) > > postgres=# select selling_date, baguette, sum(items), > grouping(selling_date), grouping(baguette), grouping_id(selling_date, > baguette) from baguette_selling group by grouping sets(1,2); > selling_date | baguette | sum | grouping | grouping | grouping_id > --------------+----------+-----+----------+----------+------------- > 2007-10-30 | | 17 | 1 | 0 | 2 > 2007-10-31 | | 12 | 1 | 0 | 2 > | golf | 9 | 0 | 1 | 1 > | buster | 20 | 0 | 1 | 1 > (4 rows) > > postgres=# select selling_date, baguette, sum(items), > grouping(selling_date), grouping(baguette), grouping_id(selling_date, > baguette) from baguette_selling group by grouping sets(1,2,()); > selling_date | baguette | sum | grouping | grouping | grouping_id > --------------+----------+-----+----------+----------+------------- > 2007-10-30 | | 17 | 1 | 0 | 2 > 2007-10-31 | | 12 | 1 | 0 | 2 > | golf | 9 | 0 | 1 | 1 > | buster | 20 | 0 | 1 | 1 > | | 29 | 0 | 0 | 0 > (5 rows) > > I thing so parser part is well and correct (and ported to 8.4). > > CTE works well, but not 100% effective, and will be better to use > direct tuplestore interface (as second technique - when hash tables > can't to be used). > > I am thinking, so the best solution is enhancing current Aggregate > node for support of GroupingSets. The code base on UNION ALL is +/- > equal to CTE, and I don't thing, so this should be optimal. But work > freely, please. I have not free time for this patch next two months. > So if you have time, it's your. > > regards > Pavel Stehule > > > > 2009/5/10 Олег Царев <zabivator@gmail.com>: >> Hello all. >> Please, approve my ideas for implementation. >> >> Standart has feature T431: Extended grouping capabilities. >> This feature i found in TODO-list: >> http://wiki.postgresql.org/wiki/Todo -> SQL Commands -> TO DO >> >> MS SQL 2005 partial support this feature: >> http://www.kodyaz.com/articles/sql-server-2005-cube-rollup-cannot-compute-distinct-aggregates.aspx >> http://blogs.msdn.com/craigfr/archive/2007/09/21/aggregation-with-rollup.aspx >> >> MS SQL 2008 support this feature: >> http://blogs.msdn.com/craigfr/archive/2007/10/11/grouping-sets-in-sql-server-2008.aspx >> >> Oracle support this feature: >> http://www.compshack.com/sql/oracle-group-rollup >> >> So, it's short notes about GROUPING SETS, but more complete >> information have in a official documentation of MS SQL and Oracle >> (copyright limited for send as attach). >> >> First. GROUPG SETS. >> >> select A,B,C,SUM(D) from table group by GROUPING SETS( (A,B,C), (A), >> () ) - it's example of use grouping sets. >> Semantic of this construction - make group by over source more, than >> one group of column. >> It's very wide key - A,B C. In result set of this example we can find >> result set of select select A,B,C,SUM(D) from table group by A,B,C - >> as subset. It's mind: "GROUP BY A,B,C" - subset of "GROUP BY GROUPING >> SETS( (A,B,C), (A), () ) >> Two subset - is GROUP BY A B, and instead C column we look NULL. >> Third subset - GROUP BY (), instead A,B,C - NULL, one row - it's name >> "GRAND TOTAL". - calculate over all subset without grouping >> >> Also have function "GROUPING" it's function say about null - "real >> null" (from table) or generated by "GROUP BY GROUPING SETS" >> >> My point: this feature can implement over GROUP BY and UNION ALL >> We can make decomposition of "GROUP BY GROUPING SETS( (A,B,C),(A),() >> )" to select A,B,C fron table GROUP BY A,B,C .UNION ALL select >> A,B,NULL from table group BY A,B UNION ALL NUll,NUll,NULL from table >> group by(); >> >> So, it's very simple, don't require modification of executor and >> callibrate cost - only parser and semantic anylysis, >> ' >> So, ROLLUP(A1,...An) is alias to "GROUP BY GROUPING SETS( >> (A1,...,An),(A1,...,An-1),... (An-1,An),(An),() ), >> CUBE - analogue. >> >> If this idea it's good - i can write code base on old patch >> http://archives.postgresql.org/pgsql-hackers/2008-10/msg00838.php or >> from clean list (as you wish). >> >> In future i know how to implement ROLLUP more optimal (executor >> iterator) and use this ROLLUP for optimisation another GROUP BY, >> GROUPING SETS. >> >> Thanks. >> >> -- >> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-hackers >> >
pgsql-hackers by date: