Re: Postgres 16.1 - Bug: cache entry already complete - Mailing list pgsql-bugs

From Richard Guo
Subject Re: Postgres 16.1 - Bug: cache entry already complete
Date
Msg-id CAMbWs4-_kPdpTmekyE_ANR9kVqkbrALFPwG8x3o3zm__jNgRMw@mail.gmail.com
Whole thread Raw
In response to Re: Postgres 16.1 - Bug: cache entry already complete  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Postgres 16.1 - Bug: cache entry already complete  (David Rowley <dgrowleyml@gmail.com>)
Re: Postgres 16.1 - Bug: cache entry already complete  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-bugs

On Wed, Jan 3, 2024 at 9:48 AM David Rowley <dgrowleyml@gmail.com> wrote:
My concern with that is that the Unique Join optimisation will cause
execution to skip to the next outer row and that will leave us no
means of marking the Memoize cache entry as complete.  In
singlerow==false Memoize nodes, we only mark the cache as complete
when we read the inner node to completion.   The unique join
optimisation means we often don't do that due to skipping to the next
outer row on finding the first inner match.

Basically, what I'm saying is that Memoize is going to result in many
more cache misses due to incomplete cache entries.  Maybe we should
have get_memoize_path() return NULL for this case so that we don't
Memoize when there's a Join Filter and extra->inner_unique is set to
true.

I agree that we should just avoid making a memoize path in this case.
And I think we already have this logic in get_memoize_path():

    if (extra->inner_unique &&
        (inner_path->param_info == NULL ||
         list_length(inner_path->param_info->ppi_clauses) <
         list_length(extra->restrictlist)))
        return NULL;

But why does the code above fail to capture the problematic query?  It
turns out that inner_path->param_info->ppi_clauses contains cloned
clauses, i.e. multiple versions of the same clause in order to cope with
outer join identity 3.  And this causes the above code to give incorrect
conclusions about the length of 'ppi_clauses'.  And this also explains
why this issue only occurs in v16.

As a clause's serial number is unique within the current PlannerInfo
context, and multiple clones of the same clause have the same serial
number, it seems to me that it's more correct to calculate the length of
ppi_clauses by:

    bms_num_members(inner_path->param_info->ppi_serials)

So I think maybe we can fix this issue with the attached patch.

BTW, the SingleRow in explain seems quite handy.  I wonder if we can
keep it.

Thanks
Richard
Attachment

pgsql-bugs by date:

Previous
From: tender wang
Date:
Subject: Re: BUG #18259: Assertion in ExtendBufferedRelLocal() fails after no-space-left condition
Next
From: PG Bug reporting form
Date:
Subject: BUG #18267: Logical replication bug: data is not synchronized after Alter Publication.