Thread: An incorrect check in get_memoize_path

An incorrect check in get_memoize_path

From
Richard Guo
Date:
While reviewing another patch [1], I noticed an incorrect check in
get_memoize_path.

    if (extra->inner_unique &&
        (inner_path->param_info == NULL ||
         bms_num_members(inner_path->param_info->ppi_serials) <
         list_length(extra->restrictlist)))
        return NULL;

This check is to ensure that, for unique joins, all the restriction
clauses are parameterized to enable the use of Memoize nodes.
However, ppi_clauses includes join clauses available from all outer
rels, not just the current outer rel, while extra->restrictlist only
includes the restriction clauses for the current join.  This means the
check could pass even if a restriction clause isn't parameterized, as
long as another join clause, which doesn't belong to the current join,
is included in ppi_clauses.  It's not very hard to construct a query
that exposes this issue.

create table t (a int, b int);

explain (costs off)
select * from t t0 left join
  (t t1 left join t t2 on t1.a = t2.a
   join lateral
     (select distinct on (a, b) a, b, t0.a+t1.a+t2.a as x from t t3 offset 0) t3
         on t1.a = t3.a and coalesce(t2.b) = t3.b)
  on t0.b = t3.b;
                       QUERY PLAN
---------------------------------------------------------
 Nested Loop Left Join
   ->  Seq Scan on t t0
   ->  Nested Loop
         Join Filter: (COALESCE(t2.b) = t3.b)
         ->  Hash Left Join
               Hash Cond: (t1.a = t2.a)
               ->  Seq Scan on t t1
               ->  Hash
                     ->  Seq Scan on t t2
         ->  Subquery Scan on t3
               Filter: ((t0.b = t3.b) AND (t1.a = t3.a))
               ->  Unique
                     ->  Sort
                           Sort Key: t3_1.a, t3_1.b
                           ->  Seq Scan on t t3_1
(15 rows)

Consider the join to t3.  It is a unique join, and not all of its
restriction clauses are parameterized.  Despite this, the check still
passes.

Interestingly, although the check passes, no Memoize nodes are added
on top of t3's paths because paraminfo_get_equal_hashops insists that
all ppi_clauses must satisfy clause_sides_match_join (though I
actually question whether this is necessary, but that's a separate
issue).  However, this does not justify the check being correct.

Hence, I propose the attached patch for the fix.

BTW, there is a XXX comment there saying that maybe we can make the
remaining join quals part of the inner scan's filter instead of the
join filter.  I don't think this is possible in all cases.  In the
above query, 'coalesce(t2.b) = t3.b' cannot be made part of t3's scan
filter, according to join_clause_is_movable_into.  So I removed that
comment in the patch while we're here.

Any thoughts?

[1] https://postgr.es/m/60bf8d26-7c7e-4915-b544-afdb9020011d@gmail.com

Thanks
Richard

Attachment

Re: An incorrect check in get_memoize_path

From
wenhui qiu
Date:
Hi Richard
Thank you work on this

- * - * XXX this could be enabled if the remaining join quals were made part of - * the inner scan's filter instead of the join filter. Maybe it's worth - * considering doing that? */ - if (extra->inner_unique && - (inner_path->param_info == NULL || - bms_num_members(inner_path->param_info->ppi_serials) < - list_length(extra->restrictlist))) - return NULL; + if (extra->inner_unique) + { + Bitmapset *ppi_serials; + + if (inner_path->param_info == NULL) + return NULL; + + ppi_serials = inner_path->param_info->ppi_serials; + + foreach_node(RestrictInfo, rinfo, extra->restrictlist) + { + if (!bms_is_member(rinfo->rinfo_serial, ppi_serials)) + return NULL; + } + } The refined version introduces a more precise validation by:
1. Explicitly verifying each restriction – Instead of relying on a count comparison, it checks whether every restriction's serial number is included in the inner path’s parameter info (ppi_serials).
2. Reducing false negatives – The optimizer now only discards a path if a restriction is definitively unhandled, allowing more valid plans to be considered.


On Mon, Apr 7, 2025 at 3:50 PM Richard Guo <guofenglinux@gmail.com> wrote:
While reviewing another patch [1], I noticed an incorrect check in
get_memoize_path.

    if (extra->inner_unique &&
        (inner_path->param_info == NULL ||
         bms_num_members(inner_path->param_info->ppi_serials) <
         list_length(extra->restrictlist)))
        return NULL;

This check is to ensure that, for unique joins, all the restriction
clauses are parameterized to enable the use of Memoize nodes.
However, ppi_clauses includes join clauses available from all outer
rels, not just the current outer rel, while extra->restrictlist only
includes the restriction clauses for the current join.  This means the
check could pass even if a restriction clause isn't parameterized, as
long as another join clause, which doesn't belong to the current join,
is included in ppi_clauses.  It's not very hard to construct a query
that exposes this issue.

create table t (a int, b int);

explain (costs off)
select * from t t0 left join
  (t t1 left join t t2 on t1.a = t2.a
   join lateral
     (select distinct on (a, b) a, b, t0.a+t1.a+t2.a as x from t t3 offset 0) t3
         on t1.a = t3.a and coalesce(t2.b) = t3.b)
  on t0.b = t3.b;
                       QUERY PLAN
---------------------------------------------------------
 Nested Loop Left Join
   ->  Seq Scan on t t0
   ->  Nested Loop
         Join Filter: (COALESCE(t2.b) = t3.b)
         ->  Hash Left Join
               Hash Cond: (t1.a = t2.a)
               ->  Seq Scan on t t1
               ->  Hash
                     ->  Seq Scan on t t2
         ->  Subquery Scan on t3
               Filter: ((t0.b = t3.b) AND (t1.a = t3.a))
               ->  Unique
                     ->  Sort
                           Sort Key: t3_1.a, t3_1.b
                           ->  Seq Scan on t t3_1
(15 rows)

Consider the join to t3.  It is a unique join, and not all of its
restriction clauses are parameterized.  Despite this, the check still
passes.

Interestingly, although the check passes, no Memoize nodes are added
on top of t3's paths because paraminfo_get_equal_hashops insists that
all ppi_clauses must satisfy clause_sides_match_join (though I
actually question whether this is necessary, but that's a separate
issue).  However, this does not justify the check being correct.

Hence, I propose the attached patch for the fix.

BTW, there is a XXX comment there saying that maybe we can make the
remaining join quals part of the inner scan's filter instead of the
join filter.  I don't think this is possible in all cases.  In the
above query, 'coalesce(t2.b) = t3.b' cannot be made part of t3's scan
filter, according to join_clause_is_movable_into.  So I removed that
comment in the patch while we're here.

Any thoughts?

[1] https://postgr.es/m/60bf8d26-7c7e-4915-b544-afdb9020011d@gmail.com

Thanks
Richard

Re: An incorrect check in get_memoize_path

From
Andrei Lepikhov
Date:
On 4/7/25 09:50, Richard Guo wrote:
> Consider the join to t3.  It is a unique join, and not all of its
> restriction clauses are parameterized.  Despite this, the check still
> passes.
At least, this code looks more simple to understand, more 'armored' and 
worth to change.
At the same time I think term 'Incorrect' is not good unless you show an 
example where data returned is not consistent to the expected.
I think this inequality check has worked in couple with the 
get_equal_hashops.
I think it may be pushed as is. But it worth to discover the 
get_equal_hashops more deeply. Discovering your idea I wrote an example 
(see attachment) which (with commented clause_sides_match_join) provides 
an incorrect Memoize - that's why I ended up with support of your change 
even when you can't show any buggy behaviour. But I've got an assertion 
that is different from the error I expected to see (error on single_row 
mode):

#0  0x00005583c45532ce in CheckVarSlotCompatibility
#1  0x00005583c4553244 in CheckExprStillValid
#2  0x00005583c45530ea in ExecInterpExprStillValid
#3  0x00005583c45a245c in ExecEvalExpr
#4  0x00005583c45a3985 in prepare_probe_slot
#5  0x00005583c45a3e52 in cache_lookup
#6  0x00005583c45a4313 in ExecMemoize

It looks strange.

-- 
regards, Andrei Lepikhov
Attachment

Re: An incorrect check in get_memoize_path

From
Richard Guo
Date:
On Mon, Apr 7, 2025 at 9:54 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
> On 4/7/25 09:50, Richard Guo wrote:
> > Consider the join to t3.  It is a unique join, and not all of its
> > restriction clauses are parameterized.  Despite this, the check still
> > passes.

> At the same time I think term 'Incorrect' is not good unless you show an
> example where data returned is not consistent to the expected.
> I think this inequality check has worked in couple with the
> get_equal_hashops.

Do you mean this check is designed to work in conjunction with the
clause_sides_match_join check in paraminfo_get_equal_hashops?  I would
consider this a very poor design.  The purpose of this check is simply
to verify whether all restriction clauses are parameterized.  I don't
understand why it needs to depend on paraminfo_get_equal_hashops in
such an unexpected way to function correctly.  I can't see any
advantage to this design, other than that it's prone to bugs.
Moreover, in the case where not all restriction clauses are
parameterized, why waste CPU cycles running all the code after the
check only for paraminfo_get_equal_hashops to catch it later?
Additionally, I couldn't find any comments explaining this unusual
behavior, either in the check itself or in paraminfo_get_equal_hashops.

Thanks
Richard



Re: An incorrect check in get_memoize_path

From
Andrei Lepikhov
Date:
On 4/8/25 08:32, Richard Guo wrote:
> On Mon, Apr 7, 2025 at 9:54 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
>> On 4/7/25 09:50, Richard Guo wrote:
>>> Consider the join to t3.  It is a unique join, and not all of its
>>> restriction clauses are parameterized.  Despite this, the check still
>>> passes.
> 
>> At the same time I think term 'Incorrect' is not good unless you show an
>> example where data returned is not consistent to the expected.
>> I think this inequality check has worked in couple with the
>> get_equal_hashops.
> 
> Do you mean this check is designed to work in conjunction with the
> clause_sides_match_join check in paraminfo_get_equal_hashops?  I would
> consider this a very poor design.
As I have written before, I am quite happy with the change you propose. 
I just pointed out that term 'incorrect' usually means you may provide a 
query which causes an error or inconsistent data which we can add to the 
tests set. Current code may be described as 'kludge' lines - but I'm not 
a native speaker, don't bikeshed here too much.

-- 
regards, Andrei Lepikhov



Re: An incorrect check in get_memoize_path

From
Richard Guo
Date:
On Mon, Apr 7, 2025 at 4:50 PM Richard Guo <guofenglinux@gmail.com> wrote:
> Hence, I propose the attached patch for the fix.
>
> BTW, there is a XXX comment there saying that maybe we can make the
> remaining join quals part of the inner scan's filter instead of the
> join filter.  I don't think this is possible in all cases.  In the
> above query, 'coalesce(t2.b) = t3.b' cannot be made part of t3's scan
> filter, according to join_clause_is_movable_into.  So I removed that
> comment in the patch while we're here.
>
> Any thoughts?

Here is an updated patch with a commit message.  Regarding
backporting, I'm inclined not to, given the lack of field reports.
Any objections to pushing it?

Thanks
Richard

Attachment

Re: An incorrect check in get_memoize_path

From
Andrei Lepikhov
Date:
On 4/14/25 08:49, Richard Guo wrote:
> On Mon, Apr 7, 2025 at 4:50 PM Richard Guo <guofenglinux@gmail.com> wrote:
>> Hence, I propose the attached patch for the fix.
>>
>> BTW, there is a XXX comment there saying that maybe we can make the
>> remaining join quals part of the inner scan's filter instead of the
>> join filter.  I don't think this is possible in all cases.  In the
>> above query, 'coalesce(t2.b) = t3.b' cannot be made part of t3's scan
>> filter, according to join_clause_is_movable_into.  So I removed that
>> comment in the patch while we're here.
>>
>> Any thoughts?
> 
> Here is an updated patch with a commit message.  Regarding
> backporting, I'm inclined not to, given the lack of field reports.
> Any objections to pushing it?
No objections.
By the way, I think you should be less shy about adding to CC people who 
have been involved in the discussion (at least David should have a 
chance to know). Also, don't hesitate to use the 'Reviewed-by' tag to 
let people know who else may remember the context of the problem, just 
in case.

-- 
regards, Andrei Lepikhov



Re: An incorrect check in get_memoize_path

From
wenhui qiu
Date:
HI 
  No objections.It's a pity that the postgresql18 version has been code-frozen


Thanks 

On Mon, Apr 14, 2025 at 4:21 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
On 4/14/25 08:49, Richard Guo wrote:
> On Mon, Apr 7, 2025 at 4:50 PM Richard Guo <guofenglinux@gmail.com> wrote:
>> Hence, I propose the attached patch for the fix.
>>
>> BTW, there is a XXX comment there saying that maybe we can make the
>> remaining join quals part of the inner scan's filter instead of the
>> join filter.  I don't think this is possible in all cases.  In the
>> above query, 'coalesce(t2.b) = t3.b' cannot be made part of t3's scan
>> filter, according to join_clause_is_movable_into.  So I removed that
>> comment in the patch while we're here.
>>
>> Any thoughts?
>
> Here is an updated patch with a commit message.  Regarding
> backporting, I'm inclined not to, given the lack of field reports.
> Any objections to pushing it?
No objections.
By the way, I think you should be less shy about adding to CC people who
have been involved in the discussion (at least David should have a
chance to know). Also, don't hesitate to use the 'Reviewed-by' tag to
let people know who else may remember the context of the problem, just
in case.

--
regards, Andrei Lepikhov