Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities) - Mailing list pgsql-hackers

From Hitoshi Harada
Subject Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)
Date
Msg-id e08cc0400908130907ya1d1902p77c533f3a6b1066f@mail.gmail.com
Whole thread Raw
In response to Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)  (Alvaro Herrera <alvherre@commandprompt.com>)
Responses Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)  (Олег Царев <zabivator@gmail.com>)
Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-hackers
2009/8/8 Alvaro Herrera <alvherre@commandprompt.com>:
> Олег Царев escribió:
>> Hello all!
>> If no one objecte (all agree, in other say) i continue work on patch -
>> particulary, i want support second strategy (tuple store instead of
>> hash-table) for save order of source (more cheap solution in case with
>> grouping sets + order by), investigate and brainstorm another
>> optimisation, writing regression tests and technical documentation.
>> But I need some time for complete my investigation internals of
>> PostgreSQL, particulary CTE.
>
> Where are we on this patch?  Is it moving forward?
>

It seems to me that the patch goes backward.

I looked trough the gsets-0.6.diff for about an hour, and found it is
now only a syntax sugar that builds multiple GROUP BY queries based on
CTE functionality. There's no executor modification.

If I remember correctly, the original patch touched executor parts.
I'd buy if the GROUPING SETS touches executor but I don't if this is
only syntax sugar, because you can write it as the same by yourself
without GROUPING SETS syntax. The motivation we push this forward is
performance that cannot be made by rewriting query, I guess.

Because GROUP BY we have today is a subset of GROUPING SETS by
definition, I suppose we'll refactor nodeAgg.c so that it is allowed
to take multiple group definitions. And we must support both of
HashAgg and GroupAgg. For HashAgg, it is easier in any case as the
earlier patch does. For GroupAgg, it is a bit complicated since we
sort by different key sets.

When we want GROUPING SET(a, b), at first we sort by a and aggregate
then sort by b and aggregate. This is the same as:

select a, null, count(*) from x group by a
union all
select null, b, count(*) from x group by b

so nothing better than query rewriting unless we invent something new.

But in case of sub total and grand total like ROLLUP query, GroupAgg
can do it by one-time scan by having multiple life cycle PerGroup
state.

Anyway, before going ahead we need to find rough sketch of how to
implement this feature. Only syntax sugar is acceptable? Or internal
executor support is necessary?


Regards,


--
Hitoshi Harada


pgsql-hackers by date:

Previous
From: Boszormenyi Zoltan
Date:
Subject: Re: DECLARE doesn't set/reset sqlca after DECLARE cursor
Next
From: Олег Царев
Date:
Subject: Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)