Re: [HACKERS] Partition-wise aggregation/grouping - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: [HACKERS] Partition-wise aggregation/grouping |
Date | |
Msg-id | CA+TgmoYp5+_VQvt-VTJETTF3PkxnTNJm5PHAdiTtpox8vgqvjA@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] Partition-wise aggregation/grouping (Jeevan Chalke <jeevan.chalke@enterprisedb.com>) |
Responses |
Re: [HACKERS] Partition-wise aggregation/grouping
|
List | pgsql-hackers |
On Tue, Mar 6, 2018 at 5:31 AM, Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote: > This is in-lined with enable_hashagg GUC. Do you think > enable_partitionwise_aggregate seems better? But it will be not consistent > with other GUC names like enable_hashagg then. Well, if I had my way, enable_hashagg would be spelled enable_hash_aggregate, too, but I wasn't involved in the project that long ago. 100% consistency is hard to achieve here; the perfect parallel of enable_hashagg would be enable_partitionwiseagg, but then it would be inconsistent with enable_partitionwise_join unless we renamed it to enable_partitionwisejoin, which I would rather not do. I think the way the enable_blahfoo names were done was kinda shortsighted -- it works OK as long as blahfoo is pretty short, like mergejoin or hashagg or whatever, but if you have more or longer words then I think it's hard to see where the word boundaries are without any punctuation. And if you start abbreviating then you end up with things like enable_pwagg which are not very easy to understand. So I favor spelling everything out and punctuating it. > So the code for doing partially aggregated partial paths and partially > aggregated non-partial path is same except partial paths goes into > partial_pathlist where as non-partial goes into pathlist of > partially_grouped_rel. Thus, calling add_paths_to_partial_grouping_rel() > when isPartialAgg = true seems correct. Also as we have decided, this > function is responsible to create all partially aggregated paths including > both partial and non-partial. > > Am I missing something? Hmm. I guess not. I think I didn't read this code well enough previously. Please find attached proposed incremental patches (0001 and 0002) which hopefully make the code in this area a bit clearer. >> + /* >> + * If there are any fully aggregated partial paths present, >> may be because >> + * of parallel Append over partitionwise aggregates, we must stick >> a >> + * Gather or Gather Merge path atop the cheapest partial path. >> + */ >> + if (grouped_rel->partial_pathlist) >> >> This comment is copied from someplace where the code does what the >> comment says, but here it doesn't do any such thing. > > Well, these comments are not present anywhere else than this place. With > Parallel Append and Partitionwise aggregates, it is now possible to have > fully aggregated partial paths now. And thus we need to stick a Gather > and/or Gather Merge atop cheapest partial path. And I believe the code does > the same. Am I missing something? I misread the code. Sigh. I should have waited until today to send that email and taken time to study it more carefully. But I still don't think it's completely correct. It will not consider using a pre-sorted path; the only strategies it can consider are cheapest path + Gather and cheapest path + explicit Sort (even if the cheapest path is already correctly sorted!) + Gather Merge. It should really do something similar to what add_paths_to_partial_grouping_rel() already does: first call generate_gather_paths() and then, if the cheapest partial path is not already correctly sorted, also try an explicit Sort + Gather Merge. In fact, it looks like we can actually reuse that logic exactly. See attached 0003 incremental patch; this changes the outputs of one of your regression tests, but the new plan looks better. Some other notes: There's a difference between performing partial aggregation in the same process and performing it in a different process. hasNonPartial tells us that we can't perform partial aggregation *at all*; hasNonSerial only tells us that partial and final aggregation must happen in the same process. This patch could possibly take advantage of partial aggregation even when hasNonSerial is set. Finalize Aggregate -> Append -> N copies of { Partial Aggregate -> Whatever } is OK with hasNonSerial = true as long as hasNonPartial = false. Now, the bad news is that for this to actually work we'd need to define new values of AggSplit, like AGGSPLIT_INITIAL = AGGSPLITOP_SKIPFINAL and AGGSPLIT_FINAL = AGGSPLITOP_COMBINE, and I'm not sure how much complexity that adds. However, if we're not going to do that, I think we'd better at last add some comments about it suggesting that someone might want to do something about it in the future. I think that, in general, it's a good idea to keep the number of times that create_grouping_paths() does something which is conditional on whether child_data is NULL to a minimum. I haven't looked at what Ashutosh tried to do there so I don't know whether it's good or bad, but I like the idea, if we can do it cleanly. It strikes me that we might want to consider refactoring things so that create_grouping_paths() takes the grouping_rel and partial_grouping_rel as input arguments. Right now, the initialization of the child grouping and partial-grouping rels is partly in try_partitionwise_aggregate(), which considers marking one of them (but never both?) as dummy rels and create_grouping_paths() which sets reloptkind, serverid, userid, etc. The logic of all of this is a little unclear to me. Presumably, if the input rel is dummy, then both the grouping_rel and the partial_grouping_rel are also dummy. Also, presumably we should set the reloptkind correctly as soon as we create the rel, not at some later stage. Or maybe what we should do is split create_grouping_paths() into two functions. Like this: if (child_data) { partial_grouping_target = child_data->partialRelTarget; partially_grouped_rel->reltarget = partial_grouping_target; agg_partial_costs = child_data->agg_partial_costs; agg_final_costs = child_data->agg_final_costs; } --- SPLIT IT HERE --- /* Apply partitionwise aggregation technique, if possible. */ try_partitionwise_grouping(root, input_rel, grouped_rel, partially_grouped_rel, target, partial_grouping_target, agg_costs, agg_partial_costs, agg_final_costs, gd, can_sort, can_hash, havingQual, isPartialAgg); It seems to me that everything from that point to the end is doing the path generation and it's all pretty much the same for the parent and child cases. But everything before that is either stuff that doesn't apply to the child case at all (like the degenerate grouping case) or stuff that should be done once and passed down (like can_sort/can_hash). The only exception I see is some of the stuff that sets up the upper rel at the top of the function, but maybe that logic could be refactored into a separate function as well (like initialize_grouping_rel). Then, instead of try_partitionwise_join() actually calling create_grouping_paths(), it would call initialize_grouping_rel() and then the path-adding function that we split off from the bottom of the current create_grouping_paths(), basically skipping all that stuff in the middle that we don't really want to do in that case. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
pgsql-hackers by date: