Re: Anti join confusion - Mailing list pgsql-hackers

From Tender Wang
Subject Re: Anti join confusion
Date
Msg-id CAHewXN=ePsgyz8fBQv8Eyk4jU1ZgVPPo8poM=zOz+jS8jMmyQQ@mail.gmail.com
Whole thread Raw
In response to Re: Anti join confusion  (Richard Guo <guofenglinux@gmail.com>)
Responses Re: Anti join confusion
List pgsql-hackers


Richard Guo <guofenglinux@gmail.com> 于2025年2月26日周三 17:46写道:
On Tue, Feb 25, 2025 at 1:30 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Feb 24, 2025 at 8:08 AM wenhui qiu <qiuwenhuifx@gmail.com> wrote:
> >     Actually ,Many fork postgresql databases have already implemented ,For example, if the relevant field has a non-null constraint ,Many databases can do the same thing as not exist ( MySQL ,SQL Server,Oracle)

> I'm not surprised to hear it. Long-time PostgreSQL users just don't
> use NOT IN, so it's fine, but anyone coming from another database gets
> hosed. I think it would be good to put some effort into improving this
> area, but I do not have time to work on it myself.

I agree that it'd be beneficial to make some improvements to NOT IN
subqueries.  From what I can see, we may have two potential options:

* As Tom mentioned, we can prove that the subquery's output never
contains NULL values and then convert the NOT IN into an anti-join.
(It seems to me that we would also need to prove that the outer side
never contains NULL values either, because whether the NULL values
from the outer side should be included in the output depends on
whether the inner side is empty.)

* We can add support in the executor to handle the NULL semantics of
NOT IN.  This may require inventing a new join type.

I'm not quite sure which option is more promising at the moment, or if
there are other options to consider.


Recently, I found Greenplum implement pull-up NOT IN subquery. They have the below comments in their codes:

We normalize NOT subqueries using the following axioms:
*
* val NOT IN (subq) =>  val <> ALL (subq)

Richard, do you have an impression about this?


--
Thanks,
Tender Wang

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: [BUG]: the walsender does not update its IO statistics until it exits
Next
From: Ilia Evdokimov
Date:
Subject: Re: Sample rate added to pg_stat_statements