> it seems that the longest GROUPING SET and all its left-continuous
> subsets could be collected from the sorted scan and the rest from hash
> aggregates.
>
> GROUPING SET () will always need a "hash" ;)
>
> To optimise any further would require use of statistics data, and is
> probably not a good idea to do before having the simpler one implemented
Absolutely right. That's a good starting point.
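
As a rough sketch of that split (Python, purely illustrative; the name
split_grouping_sets is made up here and does not come from any existing
code): take the longest grouping set as the sort key, serve every non-empty
prefix of it from the sorted scan, and send everything else to hash
aggregation. Following the remark above, the empty set () is routed to the
hash side in this sketch.

```python
def split_grouping_sets(grouping_sets):
    """Return (sort_key, sets_for_sorted_scan, sets_for_hash)."""
    # The longest set defines the sort order of the single scan.
    # On a tie in length, the first such set in the list is used.
    longest = tuple(max(grouping_sets, key=len))
    sorted_sets, hashed_sets = [], []
    for gs in grouping_sets:
        gs = tuple(gs)
        # "Left-continuous" subset: a non-empty prefix of the longest set.
        if len(gs) > 0 and gs == longest[:len(gs)]:
            sorted_sets.append(gs)
        else:
            hashed_sets.append(gs)
    return longest, sorted_sets, hashed_sets


sets = [("a", "b", "c"), ("a", "b"), ("a",), ("b",), ()]
sort_key, from_scan, from_hash = split_grouping_sets(sets)
```

Here (a,b,c), (a,b) and (a) come from the sorted scan, while (b) and ()
go to hash aggregates. Choosing the longest set is only the simple
heuristic discussed above; nothing here uses statistics.
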
> > Any ORDER BY in the query should
> > really be applied after the grouping operation.
> >
> > The CUBE and ROLLUP operators should really be applied by expanding them
> > into the equivalent collections of grouping sets.
>
> For pure ROLLUP one could shortcut the split-into-groups and
> put-together-again process, as ROLLUP is already doable from single
> sorted scan.
Actually, as long as the grouping sets are all left-continuous subsets of the
longest grouping set, it's doable from a single sorted scan. If the
implementation separates resettable aggregators from out-of-order aggregators,
you get this optimization for free, which avoids having to look for
ROLLUP specifically.
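
To illustrate what I mean by resettable aggregators, here is a minimal
sketch in Python. It assumes the rows are already sorted by the longest
grouping set, that the grouping columns are the leading columns of each row,
and that the only aggregate is a SUM over one column. The name
prefix_scan and its parameters are hypothetical. The out-of-order (hash)
side is not shown.

```python
def prefix_scan(rows, prefix_lengths, value_index):
    """Sum one column per prefix group in one pass over sorted rows."""
    results = []
    # One resettable aggregator per prefix length: (group key, running sum).
    state = {n: None for n in prefix_lengths}
    for row in rows:
        for n in prefix_lengths:
            key = tuple(row[:n])
            if state[n] is not None and state[n][0] != key:
                # The prefix changed: emit the finished group and reset.
                results.append(state[n])
                state[n] = None
            total = state[n][1] if state[n] is not None else 0
            state[n] = (key, total + row[value_index])
    # Flush the groups that are still open at end of input.
    for n in prefix_lengths:
        if state[n] is not None:
            results.append(state[n])
    return results


# Rows are (a, b, value), sorted by (a, b).
rows = [("x", 1, 10), ("x", 1, 5), ("x", 2, 1), ("y", 1, 7)]
groups = prefix_scan(rows, prefix_lengths=[2, 1], value_index=2)
```

Each aggregator resets only when its own prefix changes, so (a,b) and (a)
are both produced from the same scan without any special case for ROLLUP.
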
Cheers,
Robert