why "WHERE uid NOT IN" is so slow, and EXCEPT in the same situtation is fast? - Mailing list pgsql-performance

From Miernik
Subject why "WHERE uid NOT IN" is so slow, and EXCEPT in the same situtation is fast?
Date
Msg-id 20080731024508.53F6.0.NOFFLE@turbacz.local
Whole thread Raw
Responses Re: why "WHERE uid NOT IN" is so slow, and EXCEPT in the same situtation is fast?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Two queries which do the same thing, first one takes ages to complete
(did wait several minutes and cancelled it), while the second one took
9 seconds? Don't they do the same thing?

miernik=> EXPLAIN SELECT uid FROM locks WHERE uid NOT IN (SELECT uid FROM locks INNER JOIN wys USING (uid, login));
                                                                      QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on locks  (cost=38341.39..61365389.71 rows=48446 width=4)
   Filter: (NOT (subplan))
   SubPlan
     ->  Materialize  (cost=38341.39..39408.47 rows=79508 width=4)
           ->  Hash Join  (cost=3997.27..37989.89 rows=79508 width=4)
                 Hash Cond: (((wys.uid)::integer = (locks.uid)::integer) AND ((wys.login)::text = (locks.login)::text))
                 ->  Seq Scan on wys  (cost=0.00..13866.51 rows=633451 width=16)
                 ->  Hash  (cost=2069.91..2069.91 rows=96891 width=16)
                       ->  Seq Scan on locks  (cost=0.00..2069.91 rows=96891 width=16)
(9 rows)

Time: 231,634 ms
miernik=> EXPLAIN SELECT uid FROM locks EXCEPT (SELECT uid FROM locks INNER JOIN wys USING (uid, login));
                                                                           QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------
 SetOp Except  (cost=59306.12..60188.11 rows=17640 width=4)
   ->  Sort  (cost=59306.12..59747.12 rows=176399 width=4)
         Sort Key: "*SELECT* 1".uid
         ->  Append  (cost=0.00..41823.79 rows=176399 width=4)
               ->  Subquery Scan "*SELECT* 1"  (cost=0.00..3038.82 rows=96891 width=4)
                     ->  Seq Scan on locks  (cost=0.00..2069.91 rows=96891 width=4)
               ->  Subquery Scan "*SELECT* 2"  (cost=3997.27..38784.97 rows=79508 width=4)
                     ->  Hash Join  (cost=3997.27..37989.89 rows=79508 width=4)
                           Hash Cond: (((wys.uid)::integer = (locks.uid)::integer) AND ((wys.login)::text =
(locks.login)::text))
                           ->  Seq Scan on wys  (cost=0.00..13866.51 rows=633451 width=16)
                           ->  Hash  (cost=2069.91..2069.91 rows=96891 width=16)
                                 ->  Seq Scan on locks  (cost=0.00..2069.91 rows=96891 width=16)
(12 rows)

Time: 1479,238 ms
miernik=>

--
Miernik
http://miernik.name/

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Difference between "Explain analyze" and "\timing"
Next
From: Tom Lane
Date:
Subject: Re: why "WHERE uid NOT IN" is so slow, and EXCEPT in the same situtation is fast?