Grzegorz Jaskiewicz <gj@pointblue.com.pl> writes:
> gj=# explain select a.a from a where a not in (select a from b);
>                                QUERY PLAN
> -------------------------------------------------------------------------
>  Seq Scan on a  (cost=99035.00..257874197565.00 rows=3000000 width=4)
>    Filter: (NOT (subplan))
>    SubPlan
>      ->  Materialize  (cost=99035.00..171493.00 rows=5400000 width=4)
>            ->  Seq Scan on b  (cost=0.00..75177.00 rows=5400000 width=4)
> (5 rows)
>
>
> that's absolutely humongous cost, and it really does take ages before this
> thing finishes (had to kill it after an hour).

I think Postgres can't do better because there could be a NULL in the
subquery. If the subquery returns a NULL, then NOT IN can never evaluate to
true for any row, so no record would match. That means the query can't simply
be treated as an anti-join.

Now your column is NOT NULL, so Postgres could do better, but AFAIK we don't
look at column constraints like NOT NULL when planning. Historically we
couldn't because we didn't have plan invalidation, and the plan you posted
below with the Anti-Join is brand new in 8.4. So there is room for
improvement, but it's not exactly a bug.
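
To make the NULL point concrete, here is a minimal sketch. It uses Python's
built-in sqlite3 module purely as a convenient stand-in (an assumption made
for this illustration, not the poster's setup); the three-valued logic of
NOT IN is standard SQL, so Postgres gives the same answers. The tiny tables
and the helper name run_demo are made up for the example.

```python
import sqlite3


def run_demo(b_values):
    """Return (NOT IN result, NOT EXISTS result) for a = {1, 2, 3}.

    b_values is the hypothetical content of table b; it may include None,
    which is stored as SQL NULL.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (a INTEGER)")
    conn.execute("CREATE TABLE b (a INTEGER)")
    conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (3,)])
    conn.executemany("INSERT INTO b VALUES (?)", [(v,) for v in b_values])

    # Same shape as the query in the original post.
    not_in = conn.execute(
        "SELECT a.a FROM a WHERE a.a NOT IN (SELECT b.a FROM b) ORDER BY a.a"
    ).fetchall()

    # Anti-join style formulation: a NULL in b never matches anything,
    # so it does not suppress rows of a.
    not_exists = conn.execute(
        "SELECT a.a FROM a "
        "WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.a = a.a) ORDER BY a.a"
    ).fetchall()

    conn.close()
    return [row[0] for row in not_in], [row[0] for row in not_exists]


if __name__ == "__main__":
    print(run_demo([1]))        # no NULL in b: both forms agree
    print(run_demo([1, None]))  # NULL in b: NOT IN returns no rows at all
```

With a NULL in b, the NOT IN form returns nothing while the NOT EXISTS form
still returns 2 and 3. The two queries are only interchangeable when the
subquery column is known to contain no NULLs, which is exactly the knowledge
the planner isn't using here.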

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning