Josh Berkus <josh@agliodbs.com> writes:
> dm=# explain
> dm-# SELECT personid FROM mr.person_attributes_old
> dm-# WHERE personid NOT IN (SELECT
> personid FROM mr.person_attributes);
> QUERY PLAN
> -----------------------------------------------------------------------------------
> Seq Scan on person_attributes_old (cost=0.00..3226144059.85 rows=235732
> width=4)
> Filter: (NOT (subplan))
> SubPlan
> -> Seq Scan on person_attributes (cost=0.00..12671.07 rows=405807
> width=4)
> (4 rows)
Hmm. What you want for a NOT IN is for it to say
Filter: (NOT (hashed subplan))
which you are not getting. What's the datatypes of the two personid
columns? Is the 400k-row estimate for person_attributes reasonable?
Maybe you need to increase work_mem (nee sort_mem) to allow a
400k-row hash table?
regards, tom lane