Re: Memoize ANTI and SEMI JOIN inner - Mailing list pgsql-hackers

From Richard Guo
Subject Re: Memoize ANTI and SEMI JOIN inner
Date
Msg-id CAMbWs49G-w9UMEBeMVj_ZT6-W-3iv=1xN8QUy_gLSiGf_VXSUw@mail.gmail.com
Whole thread Raw
In response to Re: Memoize ANTI and SEMI JOIN inner  (Andrei Lepikhov <lepihov@gmail.com>)
Responses Re: Memoize ANTI and SEMI JOIN inner
List pgsql-hackers
On Mon, Mar 31, 2025 at 6:46 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
> On 3/31/25 11:03, Richard Guo wrote:
> > I reviewed this patch and I have some concerns about the following
> > code:
> >
> >      if (extra->inner_unique &&
> >          (inner_path->param_info == NULL ||
> >           bms_num_members(inner_path->param_info->ppi_serials) <
> >           list_length(extra->restrictlist)))
> >          return NULL;
> >
> > I understand that this check is used to ensure that the entire join
> > condition is parameterized in the case of unique joins, so that we can
> > safely mark the cache entry as complete after reading the first tuple.
> > However, ppi_clauses includes join clauses available from all outer
> > rels, not just the current outer rel, while extra->restrictlist only
> > includes the restriction clauses for the current join.  This means the
> > check could pass even if a restriction clause isn't parameterized, as
> > long as another join clause, which doesn't belong to the current join,
> > is included in ppi_clauses.

> Initially, I had the same concern. But if ppi_clauses contains a qual,
> it should refer to this join and, as a result, be in the
> extra->restrictlist, isn't it?

Hmm, I don't think so.  As I mentioned upthread, ppi_clauses includes
join clauses available from all outer rels, not just the current one.
So a clause included in ppi_clauses is not necessarily included in
extra->restrictlist.  As an example, consider

create table t (a int, b int);

explain (costs off)
select * from t t1 join t t2 join
    lateral (select *, t1.a+t2.a as x from t t3 offset 0) t3
    on t2.a = t3.a
  on t1.b = t3.b;
                       QUERY PLAN
---------------------------------------------------------
 Nested Loop
   ->  Seq Scan on t t2
   ->  Nested Loop
         ->  Seq Scan on t t1
         ->  Subquery Scan on t3
               Filter: ((t2.a = t3.a) AND (t1.b = t3.b))
               ->  Seq Scan on t t3_1
(7 rows)

t3's ppi_clauses includes "t2.a = t3.a" and "t1.b = t3.b", while t1/t3
join's restrictlist only includes "t1.b = t3.b".

Thanks
Richard



pgsql-hackers by date:

Previous
From: Jasper Smit
Date:
Subject: Re: Assertion with aborted UPDATE in subtransaction
Next
From: Alexander Pyhalov
Date:
Subject: Re: SQLFunctionCache and generic plans