>>>>> "Tomas" == Tomas Vondra <tv@fuzzy.cz> writes:
Tomas> I have significant doubts about the whole design,Tomas> though. Especially the decision not to use
HashAggregate,
There is no "decision not to use HashAggregate". There is simply no
support for HashAggregate yet.
Having it be able to work with GroupAggregate is essential, because
there are always cases where HashAggregate is simply not permitted
(e.g. when using distinct or sorted aggs; or unhashable types; or with
the current code, when the estimated memory usage exceeds work_mem).
HashAggregate may be a performance improvement, but it's something
that can be added afterwards rather than an essential part of the
feature.
Tomas> Now, the chaining only makes this worse, because itTomas> effectively forces a separate sort of the whole table
foreachTomas> grouping set.
It's not one sort per grouping set, it's the minimal number of sorts
needed to express the result as a union of ROLLUP clauses. The planner
code will (I believe) always find the smallest number of sorts needed.
Each aggregate node can process any number of grouping sets as long as
they represent a single rollup list (and therefore share a single sort
order).
Yes, this is slower than using one hashagg. But it solves the general
problem in a way that does not interfere with future optimization.
(HashAggregate can be added to the current implementation by first
adding executor support for hashagg with multiple grouping sets, then
in the planner, extracting as many hashable grouping sets as possible
from the list before looking for rollup lists. The chained aggregate
code can work just fine with a HashAggregate as the chain head.
We have not actually tackled this, since I'm not going to waste any
time adding optimizations before the basic idea is accepted.)
Tomas> What I envisioned when considering hacking on this a fewTomas> months back, was extending the aggregate API with
"mergeTomas>state" function,
That's not really on the cards for arbitrary non-trivial aggregate
functions.
Yes, it can be done for simple ones, and if you want to use that as a
basis for adding optimizations that's fine. But a solution that ONLY
works in simple cases isn't sufficient, IMO.
--
Andrew (irc:RhodiumToad)