Re: Anti join confusion - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Anti join confusion
Date
Msg-id CA+TgmobrV97qEMMv=mQ-H8se6AqS2+U4Sz3iWiyF4tURauQdjA@mail.gmail.com
Whole thread Raw
In response to Re: Anti join confusion  (Richard Guo <guofenglinux@gmail.com>)
Responses Re: Anti join confusion
List pgsql-hackers
On Wed, Feb 26, 2025 at 4:46 AM Richard Guo <guofenglinux@gmail.com> wrote:
> I agree that it'd be beneficial to make some improvements to NOT IN
> subqueries.  From what I can see, we may have two potential options:
>
> * As Tom mentioned, we can prove that the subquery's output never
> contains NULL values and then convert the NOT IN into an anti-join.
> (It seems to me that we would also need to prove that the outer side
> never contains NULL values either, because whether the NULL values
> from the outer side should be included in the output depends on
> whether the inner side is empty.)
>
> * We can add support in the executor to handle the NULL semantics of
> NOT IN.  This may require inventing a new join type.
>
> I'm not quite sure which option is more promising at the moment, or if
> there are other options to consider.

I'm not quite sure, either. I think that the first option (proving
that there can be no NULL values) is probably more similar to
thingswe've done elsewhere in the planner, so I think I have been
assuming that if we did something about this, it would be that.
However, I have also had the idea of extending the executor to handle
this situation, and it's possible that option is more promising. I
don't really know. I think it's pretty common to have NOT IN clauses
where one can be certain that there definitely isn't a NULL present,
but unfortunately I think it's also pretty common to have cases where
a NULL could be present, or at least, where one cannot provide that no
NULL can be present.

--
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: SQLFunctionCache and generic plans
Next
From: "Daniel Verite"
Date:
Subject: Re: pgbench client-side performance issue on large scripts