Re: Memoize ANTI and SEMI JOIN inner - Mailing list pgsql-hackers

From Richard Guo
Subject Re: Memoize ANTI and SEMI JOIN inner
Date
Msg-id CAMbWs4-czfhkn+RaA8RXL-sf+NZXDZ8fOYbTOytyY6KbVGz+mQ@mail.gmail.com
Whole thread Raw
In response to Re: Memoize ANTI and SEMI JOIN inner  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Memoize ANTI and SEMI JOIN inner
Re: Memoize ANTI and SEMI JOIN inner
List pgsql-hackers
On Thu, Mar 20, 2025 at 3:02 PM David Rowley <dgrowleyml@gmail.com> wrote:
> For making this work, I think the attached should be about the guts of
> the code changes. I didn't look at the comments. Right now I can't
> think of any reason why this can't be done, but some experimentation
> might reveal some reason that it can't.

I conducted some experiments, and I'm afraid it's not safe to consider
Memoize for semi or anti joins, unless the inner side is provably
unique.  As an example, please consider:

create table t (a int, b boolean);
insert into t select i%3, false from generate_series(1,100)i;
analyze t;

select * from t t1 where t1.a in
 (select a from t t2 where t2.b in
  (select t1.b from t t3 where t2.a > 1 offset 0));
ERROR:  cache entry already complete

With the proposed patch, this query results in an error.

The problem is that join clauses from the upper level may be moved to
the semi join.  For a given outer tuple, the first inner tuple that
satisfies the current parameters will mark the cache entry as complete
because singlerow is set to true.  However, if that inner tuple and
the current outer tuple don't satisfy the join clauses, the second
inner tuple that satisfies the parameters will complain that the cache
entry is already marked as complete.

If the inner side is provably unique, there will be no such problem,
as there won't be a second matching tuple.  OTOH, in this case, the
semi join will be reduced to an inner join by reduce_unique_semijoins.
Therefore, it doesn't make much sense to prove inner_unique for semi
joins in add_paths_to_joinrel.

Perhaps we could spend some planner cycles proving inner_unique for
anti joins, so that Memoize nodes can be considered for them?

Thanks
Richard



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Fix 035_standby_logical_decoding.pl race conditions
Next
From: John Naylor
Date:
Subject: Re: Feature freeze