Cost of sort/order by not estimated by the query planner - Mailing list pgsql-hackers
From | Laurent Laborde |
---|---|
Subject | Cost of sort/order by not estimated by the query planner |
Date | |
Msg-id | 8a1bfe660911300854h63ffb39anfeb667da662cb37c@mail.gmail.com Whole thread Raw |
Responses |
Re: Cost of sort/order by not estimated by the query planner
(Laurent Laborde <kerdezixe@gmail.com>)
Re: Cost of sort/order by not estimated by the query planner (Hitoshi Harada <umi.tanuki@gmail.com>) |
List | pgsql-hackers |
Friendly greetings ! I use postgresql 8.3.6. here is a few info about the table i'm querying : ------------------------------------------------------------- - select count(*) from _article : 17301610 - select count(*) from _article WHERE (_article.bitfield && getbit(0)) : 6729 Here are both request with problems : -------------------------------------------------- QUERY 1 : Very fast ! ------------- explain SELECT * FROM _article WHERE (_article.bitfield && getbit(0)) ORDER BY _article.id ASC LIMIT 500; QUERY PLAN -----------------------------------------------------------------------------------------------------Limit (cost=66114.13..66115.38rows=500 width=1114) -> Sort (cost=66114.13..66157.37 rows=17296 width=1114) Sort Key:id -> Bitmap Heap Scan on _article (cost=138.32..65252.29 rows=17296 width=1114) Recheck Cond: (bitfield && B'1'::bit varying) -> Bitmap Index Scan on idx_article_bitfield (cost=0.00..134.00 rows=17296 width=0) Index Cond: (bitfield && B'1'::bit varying) QUERY 2 : Endless ... (more than 30mn... i stopped the query) ------------- explain SELECT * FROM _article WHERE (_article.bitfield && getbit(0)) ORDER BY _article.id ASC LIMIT 5; QUERY PLAN -------------------------------------------------------------------------------------------------Limit (cost=0.00..2042.87rows=5 width=1114) -> Index Scan using _article_pkey on _article (cost=0.00..7066684.46 rows=17296 width=1114) Filter: (bitfield && B'1'::bit varying) (3 rows) With LIMIT 5 and LIMIT 500, the query plan are differents. Postgresql estimate that it can do a a simple index scan to find only 5 row. With more than LIMIT ~400 it estimate that it's faster to do a more complex plan. and it make sense ! The problem is in the order by, of course. If i remove the "order by" the LIMIT 5 is faster (0.044 ms) and do an index scan. At limit 500 (without order) it still use an index scan and it is slightly slower. At limit 5000 (without order) it switch to a Bitmap Index Scan + Bitmap Heap Scan and it's slower but acceptable (5.275 ms) Why, with the "QUERY 2", postgresql doesn't estimate the cost of the Sort/ORDER BY ? Of course, by ignoring the order, both query plan are right and the choice for thoses differents plans totally make sense. But... if the planner would be kind enough to considerate the cost of the order by, it would certainly choose the Bitmap Index + Bitmap Heap scan for the limit 5. And not an index_scan pkey ! I have set the statistics to 1000 for _article.bitfield, just in case (and ran a vacuum analyze), it doesn't change anything. Is that a bug ? any Idea ? Thank you :) -- Laurent "ker2x" Laborde Sysadmin & DBA at http://www.over-blog.com/
pgsql-hackers by date: