Julien Rouhaud <rjuju123@gmail.com> writes:
> On Thu, Mar 7, 2019 at 4:52 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> One more thing --- after sleeping in it, I'm questioning my earlier
>> feeling that apply_scanjoin_target_to_paths should flush existing paths
>> for partitioned rels. Maybe it was done like that with the idea of
>> letting paths with tlist computations done above the Append compete
>> with paths doing the tlist computations below? That would be a fine
>> idea if we had any costing factors that would produce a meaningful
>> choice between the two; but I'm afraid that what we're probably getting
>> right now is a quasi-random choice between paths whose costs differ
>> only by rounding errors.
> I don't know and the comments surely didn't mention that. But since
> we're trying hard to speed up everything for high number of partitions
> it seems like a good idea to avoid wasting cycles here if there's no
> clear benefit.
I stuck in some debugging printouts, and it seems that at least for
cases we exercise in the regression tests, the newly-made paths are
either the same cost or cheaper than the modified old ones. Probably
that corresponds exactly to whether or not an extra Result node is
needed to evaluate the tlist when we don't push it down, but I did
not try to verify that. Anyway it seems pointless to do the extra
work, so I left the logic as I had it, with some extra commentary
about this.
I could imagine that in future we might have some costing considerations
that would make this a more interesting choice, in which case we can
just delete a few lines and do the extra cost comparisons. (One thought
that comes to mind right away is the likely extra cost of JIT'ing multiple
copies of basically the same tlist expressions.)
It's also occurred to me to make is_dummy_rel look at
linitial(rel->pathlist) rather than assuming cheapest_total_path is
valid. In that way, it will give a correct answer whether or not
we've done set_cheapest() lately. It was already a pretty large
leap of faith for apply_scanjoin_target_to_paths to invoke a lot of
other code while it knew perfectly well that is_dummy_rel would be
giving a wrong answer. Someday, something in those other functions
might try to check that; or if we didn't blow it here, we would
someplace else, given that IS_PARTITIONED_REL() is likely to get
called in more and more places.
regards, tom lane