Re: Allowing NOT IN to use ANTI joins - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Allowing NOT IN to use ANTI joins
Date
Msg-id 75801.1403711561@sss.pgh.pa.us
In response to Re: Allowing NOT IN to use ANTI joins  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: Allowing NOT IN to use ANTI joins  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers
Simon Riggs <simon@2ndQuadrant.com> writes:
> To be clearer, what I mean is we use only the direct proof approach,
> for queries like this

>   SELECT * FROM a WHERE id NOT IN(SELECT unknown_col FROM b
>                                   WHERE unknown_col IS NOT NULL);

> and we don't try to do it for queries like this

>   SELECT * FROM a WHERE id NOT IN(SELECT not_null_column FROM b);

> because we don't know the full provenance of "not_null_column" in all
> cases and that info is (currently) unreliable.

FWIW, I think that would largely cripple the usefulness of the patch.
If you can tell people to rewrite their queries, you might as well
tell them to rewrite into NOT EXISTS.  The real-world queries that
we're trying to improve invariably look like the latter case, not the
former, because people who are running into this problem usually
aren't even thinking about the possibility of NULLs.
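
To illustrate why NULLs matter here, a minimal sketch follows. It uses
Python's built-in sqlite3 module purely as a convenient stand-in (not
PostgreSQL, and it says nothing about planner behavior); the table
contents and the nullable column name "col" are hypothetical. It only
shows the standard SQL result semantics: a single NULL in the subquery
output makes NOT IN return no rows, while the IS NOT NULL variant and
the NOT EXISTS rewrite agree with each other.

```python
import sqlite3

# In-memory database; tables a and b mirror the names used in the thread,
# but the data and the nullable column "col" are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (col INTEGER);      -- nullable on purpose
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (1), (NULL);
""")

def ids(sql):
    """Run a query and return the first column of each row as a list."""
    return [row[0] for row in conn.execute(sql)]

# Plain NOT IN: for ids 2 and 3 the comparison against NULL yields
# unknown, so the WHERE clause rejects them; id 1 is rejected by the match.
not_in = ids(
    "SELECT id FROM a WHERE id NOT IN (SELECT col FROM b) ORDER BY id"
)

# NOT IN with NULLs filtered out inside the subquery.
not_in_filtered = ids(
    "SELECT id FROM a WHERE id NOT IN "
    "(SELECT col FROM b WHERE col IS NOT NULL) ORDER BY id"
)

# NOT EXISTS rewrite: a NULL in b.col never satisfies b.col = a.id,
# so it cannot eliminate rows from a.
not_exists = ids(
    "SELECT id FROM a WHERE NOT EXISTS "
    "(SELECT 1 FROM b WHERE b.col = a.id) ORDER BY id"
)

print(not_in, not_in_filtered, not_exists)
```

The difference between the first and the other two results is what an
anti-join transformation of NOT IN has to rule out, either through an
explicit IS NOT NULL qual or through proof that the column cannot be NULL.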

I would actually say that if we only have the bandwidth to get one of
these cases done, it should be the second one, not the first.  It's
unclear to me that checking for the first case would even be worth
the planner cycles it'd cost.
        regards, tom lane


