On Wed, 20 Jul 2022 at 21:19, Richard Guo <guofenglinux@gmail.com> wrote: > So the idea is if the ECs used by the mergeclauses are prefix of the > query_pathkeys, we use this prefix as pathkeys for the mergejoin. Why > not relax this further that if the ECs in the mergeclauses and the > query_pathkeys have common prefix, we use that prefix as pathkeys? So > that we can have a plan like below:
I don't think that's a clear-cut win. There is scoring code in there to try to arrange the pathkey list in the order of most-useful-to-upper-level-joins firsts. If we were to do as you describe we could end up generating worse plans when there is some subsequent Merge Join above this one that has join conditions that the query_pathkeys are not compatible with.
Yeah, you're right. Although we would try different permutation of the pathkeys in sort_inner_and_outer() but that does not cover every possible ordering due to cost consideration. So we still need to respect the heuristics behind the pathkey order returned by this function, which is the scoring logic trying to list most-useful-to-upper-level-joins keys earlier.
Maybe your idea could be made to work in cases where bms_equal(joinrel->relids, root->all_baserels). In that case, we should not be processing any further joins and don't need to consider that as a factor for the scoring.
That should work, as long as this case is common enough to worth we writing the codes.