Thread: NOT IN Doesn't use Anti Joins?
I'm sure there is a good reason why NOT IN will not use an Anti-Join plan equivalent to NOT EXISTS due to NULL handling, but in this particular case the value being compared is in the PRIMARY KEY of both structures being joined. The NOT IN plan was killed after 10 minutes. The NOT EXISTS plan returned data in roughly 10ms. Is there a reason why the NOT IN plan could not use Anti-Joins when the column being compared against is guaranteed to be NOT NULL? Too much planner overhead to determine nullness of the column? sk=# explain select * from source_reb_listing where listing_id not in (select listing_id from source_reb_listing_specs) order by file_id desc limit 5; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------Limit (cost=729015.39..3420463.83 rows=5 width=28) -> Index Scan Backward using source_reb_listing_fileid_idx on source_reb_listing (cost=729015.39..169537219655.96 rows=314954 width=28) Filter: (NOT (SubPlan 1)) SubPlan 1 -> Materialize (cost=729015.39..1185280.74 rows=32810035width=8) -> Seq Scan on source_reb_listing_specs (cost=0.00..568040.35 rows=32810035 width=8) (6 rows) sk=# explain select * from source_reb_listing where not exists (select * from source_reb_listing_specs as t where t.listing_id = source_reb_listing.listing_id) order by file_id desc limit 5; QUERY PLAN -----------------------------------------------------------------------------------------------------------------------------------------Limit (cost=0.00..35.31 rows=5 width=28) -> Nested Loop Anti Join (cost=0.00..3880495.87 rows=549496 width=28) -> IndexScan Backward using source_reb_listing_fileid_idx on source_reb_listing (cost=0.00..1107142.20 rows=629907 width=28) -> Index Scan using source_reb_listing_specs_pkeyon source_reb_listing_specs t (cost=0.00..1592.74 rows=408 width=8) Index Cond: (t.listing_id = source_reb_listing.listing_id) (5 rows)
On Thu, Dec 17, 2009 at 9:02 AM, Rod Taylor <rod.taylor@gmail.com> wrote: > Is there a reason why the NOT IN plan could not use Anti-Joins when > the column being compared against is guaranteed to be NOT NULL? Too > much planner overhead to determine nullness of the column? I doubt it. I think it's just a question of round tuits. I think Tom hasn't felt it worth the effort since you can work around it by rewriting the query, and nobody else has bothered to work up a patch. If you feel like working on it, I think it would be a nice improvement. ...Robert