Re: IN surpasses NOT EXISTS in 7.4RC2 ?? - Mailing list pgsql-performance

From Rajesh Kumar Mallah
Subject Re: IN surpasses NOT EXISTS in 7.4RC2 ??
Date
Msg-id 3FB52CAD.9070404@trade-india.com
Whole thread Raw
In response to Re: IN surpasses NOT EXISTS in 7.4RC2 ??  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Tom Lane wrote:
Rajesh Kumar Mallah <mallah@trade-india.com> writes: 
NOT EXISTS is taking almost double time than NOT IN .
I know IN has been optimised in 7.4 but is anything 
wrong with the NOT EXISTS?   
That's the expected behavior in 7.4.  EXISTS in the style you are using
it effectively forces a nestloop-with-inner-indexscan implementation.
As of 7.4, IN can do that, but it can do several other things too,
including the hash-type plan you have here.  So assuming that the
planner chooses the right plan choice (not always a given ;-))

IN should be as fast or faster than EXISTS in all cases.

Not in this case :) , did i miss something silly?

tradein_clients=# explain  SELECT count(*) from user_accounts where email is not null and email not in
 (select email from profile_master where email is not null) ;
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Aggregate  (cost=9587726326.93..9587726326.93 rows=1 width=0)
   ->  Seq Scan on user_accounts  (cost=0.00..9587725473.40 rows=341412 width=0)
         Filter: ((email IS NOT NULL) AND (NOT (subplan)))
         SubPlan
           ->  Seq Scan on profile_master  (cost=0.00..25132.24 rows=674633 width=25)
                 Filter: (email IS NOT NULL)
(6 rows)


The query above does not return

tradein_clients=# explain analyze SELECT count(*) from user_accounts where email is not null and
not exists (select email from profile_master where email=user_accounts.email) ;
                                                                        QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=2850847.55..2850847.55 rows=1 width=0) (actual time=34075.100..34075.101 rows=1 loops=1)
   ->  Seq Scan on user_accounts  (cost=0.00..2849994.02 rows=341412 width=0) (actual time=8.066..34066.329 rows=3882 loops=1)
         Filter: ((email IS NOT NULL) AND (NOT (subplan)))
         SubPlan
           ->  Index Scan using profile_master_email on profile_master  (cost=0.00..35.60 rows=9 width=25) (actual time=0.044..0.044 rows=1 loops=686716)
                 Index Cond: ((email)::text = ($0)::text)
 Total runtime: 34075.213 ms
(7 rows)

tradein_clients=#


		regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org 


-- 

Rajesh Kumar Mallah,
Infocom Network Limited, New Delhi
phone: +91(11)6152172 (221) (L) ,9811255597 (M)

Visit http://www.trade-india.com ,
India's Leading B2B eMarketplace.

pgsql-performance by date:

Previous
From: "Arthur Ward"
Date:
Subject: Re: Union+group by planner estimates way off?
Next
From: Josh Berkus
Date:
Subject: Storage space, RAM for NUMERIC