Re: EquivalenceClasses and subqueries and PlaceHolderVars, oh my - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: EquivalenceClasses and subqueries and PlaceHolderVars, oh my |
Date | |
Msg-id | 23034.1331834451@sss.pgh.pa.us |
In response to | Re: EquivalenceClasses and subqueries and PlaceHolderVars, oh my (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: EquivalenceClasses and subqueries and PlaceHolderVars, oh my |
List | pgsql-hackers |
I wrote:
> Yeb Havinga <yebhavinga@gmail.com> writes:
>> I'm having a hard time imagining that add_child_rel_equivalences is not
>> just plain wrong. Even though it will only add child equivalence members
>> to a parent eq class when certain conditions are met, isn't it the case
>> that since a union (all) is addition of tuples and not joining, any kind
>> of propagating restrictions on a append rel child member to other areas
>> of the plan can cause unwanted results, like the ones currently seen?

> None of the known problems are the fault of that, really. The child
> entries don't cause merging of ECs, which would be the only way that
> they'd affect the semantics of the query at large. So in that sense
> they are not really members of the EC but just some auxiliary entries
> that ease figuring out whether a child expression matches an EC.

After further thought about that, I've concluded that indeed my patch
57664ed25e5dea117158a2e663c29e60b3546e1c was just plain wrong, and
Teodor was more nearly on the right track than I was in the original
discussion. If child EC members aren't full-fledged members then
there's no a-priori reason why they need to be distinct from each other.
There are only a few functions that actually match anything to child
members (although there are some others that could use Asserts or tests
to make it clearer that they aren't paying attention to child members).
AFAICT, if we don't try to enforce uniqueness of child members, the only
consequences will be:

(1) It'll be order-dependent which EquivalenceClass a child index column
is thought to match. As I explained earlier, this is not really the
fault of this representational detail, but is a basic shortcoming of the
whole current concept of ECs. Taking the first match is fine for now.

(2) It'll be unclear which of several identical subplan output columns
should be sorted by in prepare_sort_from_pathkeys. Now ordinarily that
does not particularly matter --- if you have multiple identical
nonvolatile expressions, you can take any one (and we already have a
hack in there for the volatile case). I think it *only* matters for
MergeAppend, where we need to be sure that the sort column locations
match across all the children. However, we can fix that in some
localized way instead of screwing up ECs generally. The idea I have in
mind at the moment, since create_merge_append_plan already starts by
determining the sort column locations for the MergeAppend itself, is to
pass down that info to the calls for the child plans and insist that we
match to the same column locations we found for the parent MergeAppend.

So I now propose reverting the earlier two patches (but not their
regression test cases of course) and instead hacking MergeAppend plan
building as per (2).

			regards, tom lane
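
Editorial illustration of point (2): the following is a minimal Python
sketch of the idea, not PostgreSQL code and not the eventual patch. The
function find_sort_columns, its "required" parameter, and the toy
target lists are all hypothetical; target-list entries are modeled as
plain strings and an EquivalenceClass as a set of such strings. It shows
how a first-match lookup can pick a different column in a child whose
output columns are identical, and how pinning the child to the parent's
column locations avoids the mismatch.

```python
from typing import List, Optional, Sequence, Set


def find_sort_columns(tlist: Sequence[str],
                      sort_keys: Sequence[Set[str]],
                      required: Optional[Sequence[int]] = None) -> List[int]:
    """Return the 0-based target-list position to sort by for each sort key.

    Hypothetical model: each sort key is a set of equivalent expressions
    (parent and child members together). Without `required`, the first
    matching column wins. With `required`, the match must occur at the
    given position, as proposed for the children of a MergeAppend.
    """
    positions = []
    for i, members in enumerate(sort_keys):
        if required is not None:
            pos = required[i]
            if tlist[pos] not in members:
                raise ValueError(f"column {pos} does not match sort key {i}")
        else:
            pos = next((p for p, expr in enumerate(tlist) if expr in members),
                       None)
            if pos is None:
                raise ValueError(f"no column matches sort key {i}")
        positions.append(pos)
    return positions


# Toy example (assumed): the parent sorts by its second column "b"; the
# child emits the same expression "t.x" in both output columns.
parent_tlist = ["a", "b"]
child_tlist = ["t.x", "t.x"]
sort_keys = [{"b", "t.x"}]

parent_cols = find_sort_columns(parent_tlist, sort_keys)
child_free = find_sort_columns(child_tlist, sort_keys)
child_pinned = find_sort_columns(child_tlist, sort_keys, required=parent_cols)
```

In this toy case the unconstrained child lookup settles on a different
column than the parent did, while the pinned lookup reuses the parent's
location, which is the property a merge of presorted children relies on.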