On Fri, Mar 16, 2018 at 1:50 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> Ok. That looks good.
Here's an updated version. In this version, based on a voice
discussion with Ashutosh and Jeevan, I adjusted 0001 to combine it
with an earlier idea of splitting Gather/Gather Merge path generation
out of the function that creates partially aggregated paths. The idea
here is that create_ordinary_gather_paths() could first call
create_partial_grouping_paths(), then add additional paths which might
be partial or non-partial by invoking the partition-wise aggregate
logic, then call gather_grouping_paths() and set_cheapest() to
finalize the partially grouped rel. Also, I added draft commit
messages.
With this patch set applied, the key bit of logic in
create_ordinary_grouping_paths() ends up looking like this:
if (grouped_rel->consider_parallel && input_rel->partial_pathlist != NIL
&& (flags & GROUPING_CAN_PARTIAL_AGG) != 0)
{
partially_grouped_rel =
create_partial_grouping_paths(root,
grouped_rel,
input_rel,
gd,
can_sort,
can_hash,
&agg_final_costs);
gather_grouping_paths(root, partially_grouped_rel);
set_cheapest(partially_grouped_rel);
}
I imagine that what the main partition-wise aggregate patch would do
is (1) change the conditions under which
create_partial_grouping_paths() gets called, (2) postpone
gather_grouping_paths() and set_cheapest() until after partition-wise
aggregate had been done, doing them only if partially_grouped_rel !=
NULL. Partition-wise aggregate will need to happen before
add_paths_to_grouping_rel(), though, so that the latter function can
try a FinalizeAggregate node on top of an Append added by
partition-wise aggregate.
This is a bit strange, because it will mean that partition-wise
aggregate will be attempted BEFORE adding ordinary aggregate paths to
grouped_rel but AFTER adding them to partially_grouped_rel. We could
fix that by splitting add_paths_to_grouping_rel() into two functions,
one of which performs full aggregation directly and the other of which
tries finishing partial aggregation. I'm unsure that's a good idea
though: it would mean that we have very similar logic in two different
functions that could get out of sync as a result of future code
changes, and it's not really fixing any problem.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company