Re: Converting NOT IN to anti-joins during planning - Mailing list pgsql-hackers

From Andy Fan
Subject Re: Converting NOT IN to anti-joins during planning
Date
Msg-id CAKU4AWrehmcip9tzXdmr4dXFm7qNYEOJEWd6hG0UdYzKChBqpA@mail.gmail.com
Whole thread Raw
In response to Re: Converting NOT IN to anti-joins during planning  (Antonin Houska <ah@cybertec.at>)
List pgsql-hackers


On Mon, May 27, 2019 at 4:44 PM Antonin Houska <ah@cybertec.at> wrote:
David Rowley <david.rowley@2ndquadrant.com> wrote:

> On Wed, 6 Mar 2019 at 12:54, David Rowley <david.rowley@2ndquadrant.com> wrote:
> > The latest patch is attached.
>
> Rebased version after pgindent run.

I've spent some time looking into this.

One problem I see is that SubLink can be in the JOIN/ON clause and thus it's
not necessarily at the top of the join tree. Consider this example:

CREATE TABLE a(i int);
CREATE TABLE b(j int);
CREATE TABLE c(k int NOT NULL);
CREATE TABLE d(l int);

  SELECT *
    FROM
        a
        JOIN b ON b.j NOT IN
                ( SELECT
                        c.k
                    FROM
                        c)
        JOIN d ON b.j = d.l;

Here the b.j=d.l condition makes the planner think that the "b.j NOT IN
(SELECT c.k FROM c)" sublink cannot receive NULL values of b.j, but that's not
true: it's possible that ((a JOIN b) ANTI JOIN c) is evaluated before "d" is
joined to the other tables, so the NULL values of b.j are not filtered out
early enough.


Would this be an issue?  Suppose the b.j is NULL when ((a JOIN b) ANTI JOIN c)
is evaluated,  after the evaluation, the NULL is still there.  and can be filtered
out later with  b.j = d.l;  Am I missing something? 


--
Best Regards

pgsql-hackers by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: pgsql: autovacuum: handle analyze for partitioned tables
Next
From: Andrew Dunstan
Date:
Subject: Re: multi-install PostgresNode fails with older postgres versions