Thread: NOT IN Doesn't use Anti Joins?

NOT IN Doesn't use Anti Joins?

From
Rod Taylor
Date:
I'm sure there is a good reason why NOT IN will not use an Anti-Join
plan equivalent to NOT EXISTS due to NULL handling, but in this
particular case the value being compared is in the PRIMARY KEY of both
structures being joined.

The NOT IN plan was killed after 10 minutes. The NOT EXISTS plan
returned data in roughly 10ms.

Is there a reason why the NOT IN plan could not use Anti-Joins when
the column being compared against is guaranteed to be NOT NULL? Too
much planner overhead to determine nullness of the column?


sk=# explain select * from source_reb_listing where listing_id not in
(select listing_id from source_reb_listing_specs) order by file_id
desc limit 5;
                                                                QUERY
PLAN

---------------------------------------------------------------------------------------------------------------------------------------------Limit
(cost=729015.39..3420463.83 rows=5 width=28)  ->  Index Scan Backward using source_reb_listing_fileid_idx on
 
source_reb_listing  (cost=729015.39..169537219655.96 rows=314954
width=28)        Filter: (NOT (SubPlan 1))        SubPlan 1          ->  Materialize  (cost=729015.39..1185280.74
rows=32810035width=8)                ->  Seq Scan on source_reb_listing_specs
 
(cost=0.00..568040.35 rows=32810035 width=8)
(6 rows)

sk=# explain select * from source_reb_listing where not exists (select
* from source_reb_listing_specs as t where t.listing_id =
source_reb_listing.listing_id) order by file_id desc limit 5;
  QUERY
 
PLAN

-----------------------------------------------------------------------------------------------------------------------------------------Limit
(cost=0.00..35.31 rows=5 width=28)  ->  Nested Loop Anti Join  (cost=0.00..3880495.87 rows=549496 width=28)        ->
IndexScan Backward using source_reb_listing_fileid_idx
 
on source_reb_listing  (cost=0.00..1107142.20 rows=629907 width=28)        ->  Index Scan using
source_reb_listing_specs_pkeyon
 
source_reb_listing_specs t  (cost=0.00..1592.74 rows=408 width=8)              Index Cond: (t.listing_id =
source_reb_listing.listing_id)
(5 rows)


Re: NOT IN Doesn't use Anti Joins?

From
Robert Haas
Date:
On Thu, Dec 17, 2009 at 9:02 AM, Rod Taylor <rod.taylor@gmail.com> wrote:
> Is there a reason why the NOT IN plan could not use Anti-Joins when
> the column being compared against is guaranteed to be NOT NULL? Too
> much planner overhead to determine nullness of the column?

I doubt it.  I think it's just a question of round tuits.  I think Tom
hasn't felt it worth the effort since you can work around it by
rewriting the query, and nobody else has bothered to work up a patch.
If you feel like working on it, I think it would be a nice
improvement.

...Robert