Re: BUG #7556: "select not in sub query" plan very poor vs "not exists" - Mailing list pgsql-bugs

From Kevin Grittner
Subject Re: BUG #7556: "select not in sub query" plan very poor vs "not exists"
Date
Msg-id 5059C36B020000250004A505@gw.wicourts.gov
Whole thread Raw
In response to BUG #7556: "select not in sub query" plan very poor vs "not exists"  (l1t@tom.com)
List pgsql-bugs
<l1t@tom.com> wrote:

> test=# explain select max(a.info)from sli_test a where a.id not
> in(select b.id from sli_test2 b where b.id<50000);

>  Aggregate  (cost=9241443774.00..9241443774.01 rows=1 width=12)

A slower plan for NOT IN than NOT EXISTS is a fact, but definitely
not a bug.  According to the standard, the semantics of NOT IN are
different from NOT EXISTS when there is a possibility of NULLs on
either side.  Because of those different semantics, NOT IN cannot
use certain optimizations which are available for NOT EXISTS.

Technically, if a NOT IN case could be analyzed to the point where
it is clear that both sides of the predicate are definitely free of
NULLs, it could be run the same as NOT EXISTS, but that would add
complexity and run-time expense  to the optimizer.  Some cases are
simple, but some are not at all practical.  We have chosen instead
to recommend that people use NOT EXISTS unless they really want the
rather astonishing behavior of NOT IN.

-Kevin

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #7559: syslogger doesn't close stdout and stderr
Next
From: Tom Lane
Date:
Subject: Re: BUG #7557: Transaction sees outdated grants.