Re: IN vs EXISTS equivalence - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: IN vs EXISTS equivalence
Date
Msg-id 162867790808180202p28e79d42od1e0bedb45dcaf4@mail.gmail.com
Whole thread Raw
In response to Re: IN vs EXISTS equivalence  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hello

I did some fast test on pagila database.

8.4
postgres=# explain analyze select * from film f where exists  (select
film_id from film_actor where f.film_id = film_id);                                                         QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------Hash
Join (cost=117.01..195.42 rows=966 width=390) (actual
 
time=36.011..43.314 rows=997 loops=1)  Hash Cond: (f.film_id = film_actor.film_id)  ->  Seq Scan on film f
(cost=0.00..65.00rows=1000 width=390)
 
(actual time=0.027..1.971 rows=1000 loops=1)  ->  Hash  (cost=104.94..104.94 rows=966 width=2) (actual
time=35.886..35.886 rows=997 loops=1)        ->  HashAggregate  (cost=95.28..104.94 rows=966 width=2)
(actual time=32.650..34.139 rows=997 loops=1)              ->  Seq Scan on film_actor  (cost=0.00..81.62 rows=5462
width=2) (actual time=0.081..14.232 rows=5462 loops=1)Total runtime: 45.373 ms
(7 rows)

8.3
postgres=# explain select * from film f where exists  (select film_id
from film_actor where f.film_id = film_id);                                       QUERY PLAN
------------------------------------------------------------------------------------------Seq Scan on film f
(cost=0.00..4789.34rows=500 width=390)  Filter: (subplan)  SubPlan    ->  Index Scan using idx_fk_film_id on
film_actor
(cost=0.00..28.35 rows=6 width=2)          Index Cond: ($0 = film_id)
(5 rows)

postgres=# explain analyze select * from film f where not exists
(select film_id from film_actor where f.film_id = film_id);                                                      QUERY
PLAN

------------------------------------------------------------------------------------------------------------------------Hash
AntiJoin  (cost=149.90..240.24 rows=34 width=390) (actual
 
time=25.473..28.169 rows=3 loops=1)  Hash Cond: (f.film_id = film_actor.film_id)  ->  Seq Scan on film f
(cost=0.00..65.00rows=1000 width=390)
 
(actual time=0.027..1.898 rows=1000 loops=1)  ->  Hash  (cost=81.62..81.62 rows=5462 width=2) (actual
time=24.398..24.398 rows=5462 loops=1)        ->  Seq Scan on film_actor  (cost=0.00..81.62 rows=5462
width=2) (actual time=0.035..12.400 rows=5462 loops=1)Total runtime: 28.866 ms

postgres=# explain analyze select * from film f where not exists
(select film_id from film_actor where f.film_id = film_id);
QUERY
PLAN

---------------------------------------------------------------------------------------------------------------------------------------Seq
Scanon film f  (cost=0.00..4789.34 rows=500 width=390) (actual
 
time=5.874..22.654 rows=3 loops=1)  Filter: (NOT (subplan))  SubPlan    ->  Index Scan using idx_fk_film_id on
film_actor
(cost=0.00..28.35 rows=6 width=2) (actual time=0.016..0.016 rows=1
loops=1000)          Index Cond: ($0 = film_id)Total runtime: 22.835 ms
(6 rows)

postgres=# explain analyze select * from film f where film_id in
(select film_id from film_actor);                                                         QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------Hash
Join (cost=117.01..195.42 rows=966 width=390) (actual
 
time=43.151..53.688 rows=997 loops=1)  Hash Cond: (f.film_id = film_actor.film_id)  ->  Seq Scan on film f
(cost=0.00..65.00rows=1000 width=390)
 
(actual time=0.021..6.765 rows=1000 loops=1)  ->  Hash  (cost=104.94..104.94 rows=966 width=2) (actual
time=43.091..43.091 rows=997 loops=1)        ->  HashAggregate  (cost=95.28..104.94 rows=966 width=2)
(actual time=34.754..36.275 rows=997 loops=1)              ->  Seq Scan on film_actor  (cost=0.00..81.62 rows=5462
width=2) (actual time=0.024..15.746 rows=5462 loops=1)Total runtime: 55.291 ms

postgres=# explain analyze select * from film f where film_id in
(select film_id from film_actor);                                                            QUERY
PLAN

------------------------------------------------------------------------------------------------------------------------------------Nested
LoopIN Join  (cost=0.00..175.25 rows=986 width=390) (actual
 
time=0.090..14.272 rows=997 loops=1)  ->  Seq Scan on film f  (cost=0.00..65.00 rows=1000 width=390)
(actual time=0.014..1.877 rows=1000 loops=1)  ->  Index Scan using idx_fk_film_id on film_actor  (cost=0.00..0.54
rows=6 width=2) (actual time=0.007..0.007 rows=1 loops=1000)        Index Cond: (film_actor.film_id = f.film_id)Total
runtime:15.902 ms
 
(5 rows)

8.4
postgres=# explain analyze select * from film f where film_id not in
(select film_id from film_actor);                                                    QUERY PLAN
--------------------------------------------------------------------------------------------------------------------Seq
Scanon film f  (cost=95.28..162.78 rows=500 width=390) (actual
 
time=24.324..26.015 rows=3 loops=1)  Filter: (NOT (hashed subplan))  SubPlan    ->  Seq Scan on film_actor
(cost=0.00..81.62rows=5462 width=2)
 
(actual time=0.026..12.074 rows=5462 loops=1)Total runtime: 26.201 ms
(5 rows)

8.3
postgres=# explain analyze select * from film f where film_id not in
(select film_id from film_actor);                                                    QUERY PLAN
--------------------------------------------------------------------------------------------------------------------Seq
Scanon film f  (cost=95.28..162.78 rows=500 width=390) (actual
 
time=29.713..30.456 rows=3 loops=1)  Filter: (NOT (hashed subplan))  SubPlan    ->  Seq Scan on film_actor
(cost=0.00..81.62rows=5462 width=2)
 
(actual time=0.051..13.649 rows=5462 loops=1)Total runtime: 30.644 ms
(5 rows)


8.4 is 10% faster than 8.3 on small table < 1000 rows. 8.4 has much
better prediction.

Regards
Pavel Stehule

2008/8/17 Tom Lane <tgl@sss.pgh.pa.us>:
> If you're still interested in testing CVS HEAD's handling of EXISTS,
> I've about finished what I wanted to do with it.
>
>                        regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


pgsql-hackers by date:

Previous
From: Dimitri Fontaine
Date:
Subject: Re: Patch: plan invalidation vs stored procedures
Next
From: "Pavel Stehule"
Date:
Subject: Re: Patch: plan invalidation vs stored procedures