Home > mailing lists

Re: Final Patch for GROUPING SETS - unrecognized node type: 347 - Mailing list pgsql-hackers

From	Andrew Gierth
Subject	Re: Final Patch for GROUPING SETS - unrecognized node type: 347
Date	September 6, 2014 21:34:29
Msg-id	874mwka4hd.fsf@news-spur.riddles.org.uk Whole thread Raw
In response to	Re: Final Patch for GROUPING SETS - unrecognized node type: 347 (Tomas Vondra <tv@fuzzy.cz>)
Responses	Re: Final Patch for GROUPING SETS - unrecognized node type: 347
List	pgsql-hackers

Tree view

>>>>> "Tomas" == Tomas Vondra <tv@fuzzy.cz> writes:
Tomas> I have significant doubts about the whole design,Tomas> though. Especially the decision not to use
HashAggregate,

There is no "decision not to use HashAggregate". There is simply no
support for HashAggregate yet.

Having it be able to work with GroupAggregate is essential, because
there are always cases where HashAggregate is simply not permitted
(e.g. when using distinct or sorted aggs; or unhashable types; or with
the current code, when the estimated memory usage exceeds work_mem).
HashAggregate may be a performance improvement, but it's something
that can be added afterwards rather than an essential part of the
feature.
Tomas> Now, the chaining only makes this worse, because itTomas> effectively forces a separate sort of the whole table
foreachTomas> grouping set.

It's not one sort per grouping set, it's the minimal number of sorts
needed to express the result as a union of ROLLUP clauses. The planner
code will (I believe) always find the smallest number of sorts needed.

Each aggregate node can process any number of grouping sets as long as
they represent a single rollup list (and therefore share a single sort
order).

Yes, this is slower than using one hashagg. But it solves the general
problem in a way that does not interfere with future optimization.

(HashAggregate can be added to the current implementation by first
adding executor support for hashagg with multiple grouping sets, then
in the planner, extracting as many hashable grouping sets as possible
from the list before looking for rollup lists. The chained aggregate
code can work just fine with a HashAggregate as the chain head.

We have not actually tackled this, since I'm not going to waste any
time adding optimizations before the basic idea is accepted.)
Tomas> What I envisioned when considering hacking on this a fewTomas> months back, was extending the aggregate API with
"mergeTomas>state" function,

That's not really on the cards for arbitrary non-trivial aggregate
functions.

Yes, it can be done for simple ones, and if you want to use that as a
basis for adding optimizations that's fine. But a solution that ONLY
works in simple cases isn't sufficient, IMO.

--
Andrew (irc:RhodiumToad)

pgsql-hackers by date:

From: Tomas Vondra
Date: 06 September 2014, 20:04:57
Subject: Re: Final Patch for GROUPING SETS - unrecognized node type: 347

From: Tom Lane
Date: 06 September 2014, 21:39:37
Subject: Re: Patch for psql History Display on MacOSX

Re: Final Patch for GROUPING SETS - unrecognized node type: 347 - Mailing list pgsql-hackers

Previous

Next