Re: Postgres 16.1 - Bug: cache entry already complete - Mailing list pgsql-bugs

From Richard Guo
Subject Re: Postgres 16.1 - Bug: cache entry already complete
Date
Msg-id CAMbWs4-522XgUyK3BR3S6U-yiY36wLv7acP9Bf6mJS0zkXJDNw@mail.gmail.com
Whole thread Raw
In response to Re: Postgres 16.1 - Bug: cache entry already complete  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-bugs

On Thu, Jan 4, 2024 at 8:04 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Thu, 4 Jan 2024 at 00:01, Richard Guo <guofenglinux@gmail.com> wrote:
>     if (extra->inner_unique &&
>         (inner_path->param_info == NULL ||
>          list_length(inner_path->param_info->ppi_clauses) <
>          list_length(extra->restrictlist)))
>         return NULL;

I see. So it worked fine until Tom's " Make Vars be outer-join-aware." stuff.

9e215378d fixed this already and looks like I came to the same
conclusion about the cache completion issue back then too.

Exactly.
 
> But why does the code above fail to capture the problematic query?  It
> turns out that inner_path->param_info->ppi_clauses contains cloned
> clauses, i.e. multiple versions of the same clause in order to cope with
> outer join identity 3.  And this causes the above code to give incorrect
> conclusions about the length of 'ppi_clauses'.  And this also explains
> why this issue only occurs in v16.
>
> As a clause's serial number is unique within the current PlannerInfo
> context, and multiple clones of the same clause have the same serial
> number, it seems to me that it's more correct to calculate the length of
> ppi_clauses by:
>
>     bms_num_members(inner_path->param_info->ppi_serials)
>
> So I think maybe we can fix this issue with the attached patch.

hmm, I'd need to study what Tom's been changing around this. I don't
have a good handle on which qual lists can have duplicated quals.

In [1], you can see I took the inspiration for that fix from
generate_mergejoin_paths().  I don't yet know if it's just ppi_clauses
that can have duplicates or if there's a similar hazard in the
list_length() checks in generate_mergejoin_paths()

AFAIU we may have clone clauses in RelOptInfo.joininfo to cope with
commuted-left-join cases per identity 3.  When we build ParamPathInfo
for parameterized paths we may move join clauses to ppi_clauses when
possible, so ParamPathInfo.ppi_clauses may also have clone clauses.

When we construct restrict list for a join relation from the joininfo
lists of the relations it joins, only one of the clone clauses would be
chosen to be put in the restrict list if they are restriction clauses,
thanks to the check against required_relids and incompatible_relids.

    if (rinfo->has_clone || rinfo->is_clone)
    {
        Assert(!RINFO_IS_PUSHED_DOWN(rinfo, joinrel->relids));
        if (!bms_is_subset(rinfo->required_relids, both_input_relids))
            continue;
        if (bms_overlap(rinfo->incompatible_relids, both_input_relids))
            continue;
    }

The mergeclause_list examined in generate_mergejoin_paths() is extracted
from joinrel's restrictlist, so it will not have multiple versions for
one clause.  Therefore, I think we're good with the list_length() checks
in generate_mergejoin_paths().
 
> BTW, the SingleRow in explain seems quite handy.  I wonder if we can
> keep it.

If the fix is to just disable Memoize when the ppi_clauses don't cover
the entire join condition, then the Inner Unique from the parent
nested loop tells us this. Isn't that enough?  The reason I added
SingleRow was because in the patch I posted, I'd made it so SingleRow
could be false in cases where Inner Unique was true.

Ah yes, you're right.

Thanks
Richard

pgsql-bugs by date:

Previous
From: Alexander Lakhin
Date:
Subject: Re: BUG #17798: Incorrect memory access occurs when using BEFORE ROW UPDATE trigger
Next
From: PG Bug reporting form
Date:
Subject: BUG #18283: vacuum full use a large amount of memory (may cause OOM)