Re: speeding up planning with partitions - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: speeding up planning with partitions |
Date | |
Msg-id | 19f54c17-1619-b228-10e5-ca343be6a4e8@lab.ntt.co.jp Whole thread Raw |
In response to | Re: speeding up planning with partitions (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: speeding up planning with partitions
|
List | pgsql-hackers |
Thanks a lot for reviewing. On 2019/03/23 6:07, Tom Lane wrote: > Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes: >> [ v34 patch set ] > > I had a bit of a look through this. I went ahead and pushed 0008 and > 0009, as they seem straightforward and independent of the rest. (BTW, > 0009 makes 0003's dubious optimization in set_relation_partition_info > quite unnecessary.) As for the rest: > > > 0001: OK in principle, but I wonder why you implemented it by adding > another recursive scan of the jointree rather than just iterating > over the baserel array, as in make_one_rel() for instance. Seems > like this way is more work, and more code that we'll have to touch > if we ever change the jointree representation. I've changed this to work by iterating over baserel array. I was mostly worried about looping over potentially many elements of baserel array that we simply end up skipping, but considering the other patch that delays adding inheritance children, we don't need to worry about that. > I also feel like you used a dartboard while deciding where to insert the > call in query_planner(); dropping it into the middle of a sequence of > equivalence-class-related operations seems quite random. Maybe we could > delay that step all the way to just before make_one_rel, since the other > stuff in between seems to only care about baserels? For example, > it'd certainly be better if we could run remove_useless_joins before > otherrel expansion, so that we needn't do otherrel expansion at all > on a table that gets thrown away as being a useless join. create_lateral_join_info() expects otherrels to be present to propagate the information it creates. I have moved add_other_rels_to_query() followed by create_lateral_join_info() to just before make_one_rel(). > BTW, it strikes me that we could take advantage of the fact that > baserels must all appear before otherrels in the rel array, by > having loops over that array stop early if they're only interested > in baserels. We could either save the index of the last baserel, > or just extend the loop logic to fall out upon hitting an otherrel. > Seems like this'd save some cycles when there are lots of partitions, > although perhaps loops like that would be fast enough to not matter. As Imai-san's testing shows, there's certainly a slight speedup to be expected. If we want that, maybe we could use his patch. > 0002: I seriously dislike this from a system-structure viewpoint. > For one thing it makes root->processed_tlist rather moot, since it > doesn't get set till after most of the work is done. (I don't know > if there are any FDWs out there that look at processed_tlist, but > they'd be broken by this if so. It'd be better to get rid of the > field if we can, so that at least such breakage is obvious. Or, > maybe, use root->processed_tlist as the communication field rather > than having the tlist be a param to query_planner?) Getting rid of processed_tlist altogether seems rather daunting to me, so I've adopted your suggestion of adding any new junk TLEs to processed_tlist directly. > I also > don't like the undocumented way that you've got build_base_rel_tlists > working on something that's not the final tlist, and then going back > and doing more work of the same sort later. I wonder whether we can > postpone build_base_rel_tlists until after the other stuff happens, > so that it can continue to do all of the tlist-distribution work. Seeing your other reply to this part, I withdraw 0002 as previously proposed. Instead, I modified 0003 to teach expand_inherited_rtentry() to add "wholerow" junk TLE if any foreign table children need it, and a "tableoid" junk TLE needed for the inheritance case. It also calls add_vars_to_targetlist() directly to add those vars to parent's reltaget. The newly added TLEs are added directly to processed_tlist. "ctid" junk TLE is added by preprocess_targetlist() as before. Maybe, we can remove the code in preprocess_targetlist() for adding "wholerow" and "tableoid" junk TLEs, as it's essentially dead code after this patch. > 0003: I haven't studied this in great detail, but it seems like there's > some pretty random things in it, like this addition in > inheritance_planner: > > + /* grouping_planner doesn't need to see the target children. */ > + subroot->append_rel_list = list_copy(orig_append_rel_list); > > That sure seems like a hack unrelated to the purpose of the patch ... and > since list_copy is a shallow copy, I'm not even sure it's safe. Won't > we be mutating the contained (and shared) AppendRelInfos as we go along? Sorry, the comment wasn't very clear. As for how this is related to this patch, we need subroot to have its own append_rel_list, because query_planner() will add new entries to it if there are any inheritance parents that are source relations. We wouldn't want them to be added to the parent root's append_rel_list, because the next child will want to start with same append_rel_list in its subroot as the first child. As for why orig_append_rel_list and not parent_root->append_rel_list, the latter contains appinfos for target child relations, which need not be visible to query_planner(). In fact, all loops over append_rel_list during query planning simply ignore those appinfos, because their parent relation is not part of the translated query's join tree. Indeed, we can do copyObject(orig_append_rel_list) and get rid of some instances of code specific to subqueryRTindexes != NULL. The first block of such code makes copies of appinfos that reference subquery RTEs, which can simply be deleted because orig_append_rel_list contains only the appinfos pertaining to flattened UNION ALL. The second block applies ChangeVarNodes() to appinfos that reference subquery RTEs, necessitating copying in the first place, which currently loops over subroot->append_rel_list to filter out those that don't need to be changed. If subroot->append_rel_list is a copy of orig_append_rel_list, this filtering is unnecessary, so we can simply do: ChangeVarNodes((Node *) subroot->append_rel_list, rti, newrti, 0); I've made the above changes and updated the comment to reflect that. > 0004 and 0005: I'm not exactly thrilled about loading more layers of > overcomplication into inheritance_planner, especially not when the > complication then extends out into new global planner flags with > questionable semantics. We should be striving to get rid of that > function, as previously discussed, and I fear this is just throwing > more roadblocks in the way of doing so. Can we skip these and come > back to the question after we replace inheritance_planner with > expand-at-the-bottom? As you know, since we're changing things so that source inheritance is expanded in query_planner(), any UPDATE/DELETE queries containing inheritance parents as source relations will regress to some degree, because source inheritance expansion will be repeated for every child query. I don't like the new flag contains_inherit_children either, because it will be rendered unnecessary the day we get rid of inheritance_planner, but I don't see any other way to get what we want. If we're to forego 0004 because of that complexity, at least we should consider 0005, because its changes are fairly local to inheritance_planner(). The speedup and savings in memory consumption by avoiding putting target child RTEs in the child query are significant, as discussed upthread. I have moved 0004 to be the last patch in the series to make way for other simpler patches to be considered first. > 0006: Seems mostly OK, but I'm not very happy about the additional > table_open calls it's throwing into random places in the planner. > Those can be rather expensive. Why aren't we collecting the appropriate > info during the initial plancat.c consultation of the relcache? Hmm, you're seeing those because we're continuing to use the old internal interfaces of inherit.c. Especially, with the existing interfaces, we need both the parent and child relations to be open to build the AppendRelInfo. Note that we are using the same code for initializing both target child relations and source child relations, and because we don't have RelOptInfos available in the former case, we can't change the internal interfaces to use RelOptInfos for passing information around. Previous versions of this patch did add TupleDesc and Oid fields to RelOptInfo to store a relation's rd_att and rd_rel->reltype, respectively, that are needed to construct AppendRelInfo. It also performed massive refactoring of the internal interfaces of inherit.c to work by passing around parent RelOptInfo, also considering that one caller (inheritance_planner) doesn't build RelOptInfos before calling. But I thought all that refactoring would be a harder sell than adding a table_open() call in joinrels.c to be able to call make_append_rel_info() directly for building what's really a no-op AppendRelInfo. > 0007: Not really sold on this; it seems like it's going to be a net > negative for cases where there aren't a lot of partitions. Performance loss for smaller number of partitions is in the noise range, but what we gain for large number of partitions seems pretty significant to me: nparts no live_parts live_parts ====== ============= ========== 2 3397 3391 8 3365 3337 32 3316 3379 128 3338 3399 512 3273 3321 1024 3439 3517 4096 3113 3227 8192 2849 3215 Attached find updated patches. 0002 is a new patch to get rid of the duplicate RTE and PlanRowMark that's created for partitioned parent table, as it's pointless. I was planning to propose it later, but decided to post it now if we're modifying the nearby code anyway. Thanks, Amit
Attachment
- v35-0001-Build-other-rels-of-appendrel-baserels-in-a-sepa.patch
- v35-0002-Get-rid-of-duplicate-child-RTE-for-partitioned-t.patch
- v35-0003-Delay-adding-inheritance-child-tables-until-quer.patch
- v35-0004-Perform-pruning-in-expand_partitioned_rtentry.patch
- v35-0005-Teach-planner-to-only-process-unpruned-partition.patch
- v35-0006-Fix-inheritance_planner-to-avoid-useless-work.patch
- v35-0007-Adjust-inheritance_planner-to-reuse-source-child.patch
pgsql-hackers by date: