Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities) - Mailing list pgsql-hackers

From	Hitoshi Harada
Subject	Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)
Date	August 13, 2009 13:08:28
Msg-id	e08cc0400908130907ya1d1902p77c533f3a6b1066f@mail.gmail.com
In response to	Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities) (Alvaro Herrera <alvherre@commandprompt.com>)
Responses	Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities) Re: Implementation of GROUPING SETS (T431: Extended grouping capabilities)
List	pgsql-hackers

2009/8/8 Alvaro Herrera <alvherre@commandprompt.com>:
> Олег Царев wrote:
>> Hello all!
>> If no one objects (in other words, if all agree), I will continue work on
>> the patch. In particular, I want to support a second strategy (tuple store
>> instead of hash table) to preserve the order of the source (a cheaper
>> solution for grouping sets + ORDER BY), investigate and brainstorm other
>> optimisations, and write regression tests and technical documentation.
>> But I need some time to complete my investigation of PostgreSQL
>> internals, particularly CTEs.
>
> Where are we on this patch?  Is it moving forward?
>

It seems to me that the patch is going backward.

I looked through gsets-0.6.diff for about an hour and found that it is
now only syntactic sugar that builds multiple GROUP BY queries on top of
the CTE functionality. There is no executor modification.

If I remember correctly, the original patch touched the executor.
I'd buy GROUPING SETS if it touches the executor, but not if it is
only syntactic sugar, because you can write the same query yourself
without the GROUPING SETS syntax. The motivation for pushing this
forward is, I guess, performance that cannot be achieved by rewriting
the query.

Because the GROUP BY we have today is by definition a special case of
GROUPING SETS, I suppose we'll refactor nodeAgg.c so that it can take
multiple group definitions. And we must support both HashAgg and
GroupAgg. HashAgg is the easier case, as the earlier patch did it.
GroupAgg is a bit more complicated, since we have to sort by different
key sets.
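
The hash-based case can be sketched as follows. This is only an illustration in Python, not the C code of nodeAgg.c or of any posted patch; the function name hash_grouping_sets, the rows-as-dicts representation, and count(*) as the only aggregate are all assumptions made for the example.

```python
from collections import defaultdict


def hash_grouping_sets(rows, grouping_sets):
    """Count rows per group for several grouping sets in one scan.

    rows: iterable of dicts (hypothetical row representation).
    grouping_sets: list of tuples of column names.
    """
    # One hash table per grouping set, all fed from the same input scan.
    tables = [defaultdict(int) for _ in grouping_sets]
    for row in rows:
        for table, cols in zip(tables, grouping_sets):
            key = tuple(row[c] for c in cols)
            table[key] += 1  # count(*) is the only aggregate in this sketch
    return {cols: dict(table) for cols, table in zip(grouping_sets, tables)}


rows = [
    {"a": 1, "b": "x"},
    {"a": 1, "b": "y"},
    {"a": 2, "b": "x"},
]
result = hash_grouping_sets(rows, [("a",), ("b",)])
```

The input is read once and no ordering is required, which is why the hash strategy extends to multiple group definitions without much trouble.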

When we want GROUPING SETS (a, b), we first sort by a and aggregate,
then sort by b and aggregate. This is the same as:

select a, null, count(*) from x group by a
union all
select null, b, count(*) from x group by b

so this is no better than query rewriting unless we invent something new.
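
To make the cost visible, here is a small Python sketch of the sort-based approach for unrelated grouping sets. The helper names sorted_group_count and grouping_sets_by_sorting are hypothetical, rows are assumed to be dicts, and count(*) is the only aggregate. Each grouping column needs its own sort and its own pass, exactly like one branch of the UNION ALL.

```python
from itertools import groupby
from operator import itemgetter


def sorted_group_count(rows, col):
    """Sort by one column, then count each run of equal values."""
    ordered = sorted(rows, key=itemgetter(col))
    return [(k, sum(1 for _ in g)) for k, g in groupby(ordered, key=itemgetter(col))]


def grouping_sets_by_sorting(rows, cols):
    """One sort-and-aggregate pass per grouping column."""
    out = []
    for col in cols:
        for value, count in sorted_group_count(rows, col):
            # Columns outside the current grouping set are reported as None (NULL).
            out.append(tuple(value if c == col else None for c in cols) + (count,))
    return out


rows = [
    {"a": 1, "b": "x"},
    {"a": 1, "b": "y"},
    {"a": 2, "b": "x"},
]
result = grouping_sets_by_sorting(rows, ("a", "b"))
```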

But for subtotals and a grand total, as in a ROLLUP query, GroupAgg
can do it in a single scan by keeping multiple PerGroup states with
different life cycles.
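
A rough Python sketch of that idea, assuming the input is already sorted by the ROLLUP keys and count(*) is the only aggregate. The function name rollup_count and the list of counters standing in for per-group states are illustrative assumptions, not PostgreSQL structures. Each level keeps its own running state; a state is emitted and reset only when the key prefix that defines its group changes.

```python
def rollup_count(sorted_rows, keys):
    """Return (group_key, count) pairs for ROLLUP(keys) in one pass.

    sorted_rows must already be ordered by keys. Rolled-up columns
    appear as None in group_key.
    """
    n = len(keys)
    # counts[i] is the running state of the level grouping by the first i keys;
    # counts[0] is the grand total and lives for the whole scan.
    counts = [0] * (n + 1)
    out = []
    prev = None
    for row in sorted_rows:
        cur = tuple(row[k] for k in keys)
        if prev is not None and cur != prev:
            # First key position that changed: every finer level ends here.
            d = next(i for i in range(n) if cur[i] != prev[i])
            for level in range(n, d, -1):
                out.append((prev[:level] + (None,) * (n - level), counts[level]))
                counts[level] = 0
        for level in range(n + 1):
            counts[level] += 1
        prev = cur
    if prev is not None:
        # End of input closes every level, finest first, grand total last.
        for level in range(n, -1, -1):
            out.append((prev[:level] + (None,) * (n - level), counts[level]))
    return out


rows = [
    {"a": 1, "b": "x"},
    {"a": 1, "b": "x"},
    {"a": 1, "b": "y"},
    {"a": 2, "b": "x"},
]
result = rollup_count(rows, ("a", "b"))
```

A single sort on (a, b) is enough here because every ROLLUP level groups by a prefix of the same key list, which is not true of unrelated grouping sets.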

Anyway, before going ahead we need a rough sketch of how to implement
this feature. Is syntactic sugar alone acceptable? Or is internal
executor support necessary?


Regards,


--
Hitoshi Harada
