On Thu, 4 Jan 2024 at 00:01, Richard Guo <guofenglinux@gmail.com> wrote: > if (extra->inner_unique && > (inner_path->param_info == NULL || > list_length(inner_path->param_info->ppi_clauses) < > list_length(extra->restrictlist))) > return NULL;
I see. So it worked fine until Tom's " Make Vars be outer-join-aware." stuff.
9e215378d fixed this already and looks like I came to the same conclusion about the cache completion issue back then too.
Exactly.
> But why does the code above fail to capture the problematic query? It > turns out that inner_path->param_info->ppi_clauses contains cloned > clauses, i.e. multiple versions of the same clause in order to cope with > outer join identity 3. And this causes the above code to give incorrect > conclusions about the length of 'ppi_clauses'. And this also explains > why this issue only occurs in v16. > > As a clause's serial number is unique within the current PlannerInfo > context, and multiple clones of the same clause have the same serial > number, it seems to me that it's more correct to calculate the length of > ppi_clauses by: > > bms_num_members(inner_path->param_info->ppi_serials) > > So I think maybe we can fix this issue with the attached patch.
hmm, I'd need to study what Tom's been changing around this. I don't have a good handle on which qual lists can have duplicated quals.
In [1], you can see I took the inspiration for that fix from generate_mergejoin_paths(). I don't yet know if it's just ppi_clauses that can have duplicates or if there's a similar hazard in the list_length() checks in generate_mergejoin_paths()
AFAIU we may have clone clauses in RelOptInfo.joininfo to cope with commuted-left-join cases per identity 3. When we build ParamPathInfo for parameterized paths we may move join clauses to ppi_clauses when possible, so ParamPathInfo.ppi_clauses may also have clone clauses.
When we construct restrict list for a join relation from the joininfo lists of the relations it joins, only one of the clone clauses would be chosen to be put in the restrict list if they are restriction clauses, thanks to the check against required_relids and incompatible_relids.
if (rinfo->has_clone || rinfo->is_clone) { Assert(!RINFO_IS_PUSHED_DOWN(rinfo, joinrel->relids)); if (!bms_is_subset(rinfo->required_relids, both_input_relids)) continue; if (bms_overlap(rinfo->incompatible_relids, both_input_relids)) continue; }
The mergeclause_list examined in generate_mergejoin_paths() is extracted from joinrel's restrictlist, so it will not have multiple versions for one clause. Therefore, I think we're good with the list_length() checks in generate_mergejoin_paths().
> BTW, the SingleRow in explain seems quite handy. I wonder if we can > keep it.
If the fix is to just disable Memoize when the ppi_clauses don't cover the entire join condition, then the Inner Unique from the parent nested loop tells us this. Isn't that enough? The reason I added SingleRow was because in the patch I posted, I'd made it so SingleRow could be false in cases where Inner Unique was true.