bad planner pick... but why? - Mailing list pgsql-novice
From | |
---|---|
Subject | bad planner pick... but why? |
Date | |
Msg-id | 03b301c6f13b$caaf7a20$6501a8c0@iwing Whole thread Raw |
Responses |
Re: bad planner pick... but why?
|
List | pgsql-novice |
hi list please have a look at this two queries: ---------------------------------------- SELECT mov_id FROM oldtables.movies LEFT JOIN oldtables.content ON movies.mov_id = content.c_m_id WHERE mov_id IN (SELECT DISTINCT rel_movieid FROM infos.rel_persons WHERE rel_personid = 40544) plan: Hash IN Join (cost=205.04..14030.40 rows=1 width=4) (actual time=105.568..277.782 rows=1 loops=1) Hash Cond: ("outer".mov_id = "inner".rel_movieid) -> Hash Left Join (cost=184.00..13824.49 rows=36973 width=4) (actual time=7.563..260.627 rows=36997 loops=1) Hash Cond: ("outer".mov_id = "inner".c_m_id) -> Seq Scan on movies (cost=0.00..13140.73 rows=36973 width=4) (actual time=0.070..206.254 rows=36997 loops=1) -> Hash (cost=174.60..174.60 rows=3760 width=4) (actual time=7.467..7.467 rows=3760 loops=1) -> Seq Scan on content (cost=0.00..174.60 rows=3760 width=4) (actual time=0.015..4.729 rows=3760 loops=1) -> Hash (cost=21.03..21.03 rows=1 width=4) (actual time=0.113..0.113 rows=1 loops=1) -> Unique (cost=20.99..21.02 rows=1 width=4) (actual time=0.108..0.109 rows=1 loops=1) -> Sort (cost=20.99..21.01 rows=6 width=4) (actual time=0.106..0.107 rows=1 loops=1) Sort Key: rel_persons.rel_movieid -> Index Scan using rel_persons_personid_idx on rel_persons (cost=0.00..20.92 rows=6 width=4) (actual time=0.094..0.098 rows=1 loops=1) Index Cond: (rel_personid = 40544) Total runtime: 277.901 ms ---------------------------------------- SELECT mov_id FROM oldtables.movies LEFT JOIN oldtables.content ON movies.mov_id = content.c_m_id WHERE c_m_id IN (SELECT DISTINCT rel_movieid FROM infos.rel_persons WHERE rel_personid = 40544) plan: Nested Loop (cost=20.99..32.69 rows=1 width=4) (actual time=0.169..0.175 rows=1 loops=1) -> Nested Loop (cost=20.99..27.05 rows=1 width=8) (actual time=0.158..0.162 rows=1 loops=1) -> Unique (cost=20.99..21.02 rows=1 width=4) (actual time=0.114..0.116 rows=1 loops=1) -> Sort (cost=20.99..21.01 rows=6 width=4) (actual time=0.113..0.113 rows=1 loops=1) Sort Key: rel_persons.rel_movieid -> Index Scan using rel_persons_personid_idx on rel_persons (cost=0.00..20.92 rows=6 width=4) (actual time=0.098..0.102 rows=1 loops=1) Index Cond: (rel_personid = 40544) -> Index Scan using movies_mov_id_idx on movies (cost=0.00..6.00 rows=1 width=4) (actual time=0.039..0.040 rows=1 loops=1) Index Cond: ("outer".rel_movieid = movies.mov_id) -> Index Scan using content_pkey on content (cost=0.00..5.63 rows=1 width=4) (actual time=0.008..0.009 rows=1 loops=1) Index Cond: ("outer".mov_id = content.c_m_id) Total runtime: 0.295 ms ---------------------------------------- query #1 is factor 1000 slower, because the two tables "movies" (~40k entries) and "content" (~30k entries) seem to be joined prior to filtering by the IN (....). any ideas why the planer decides not to first evaluate the IN (...) statement in the first case? here's the plan for the IN (...) subselect: ---------------------------------------- SELECT DISTINCT rel_movieid FROM infos.rel_persons WHERE rel_personid = 40544 plan: Unique (cost=20.99..21.02 rows=1 width=4) (actual time=0.112..0.113 rows=1 loops=1) -> Sort (cost=20.99..21.01 rows=6 width=4) (actual time=0.110..0.110 rows=1 loops=1) Sort Key: rel_movieid -> Index Scan using rel_persons_personid_idx on rel_persons (cost=0.00..20.92 rows=6 width=4) (actual time=0.095..0.098 rows=1 loops=1) Index Cond: (rel_personid = 40544) Total runtime: 0.155 ms ---------------------------------------- its pgsql 8.1 thanks, thomas
pgsql-novice by date: