Re: Seq scans on indexed columns. - Mailing list pgsql-performance
From | Shane Ambler |
---|---|
Subject | Re: Seq scans on indexed columns. |
Date | |
Msg-id | 478BB2C2.7040901@Sheeky.Biz Whole thread Raw |
In response to | Seq scans on indexed columns. (Yannick Le Guédart <yannick@over-blog.com>) |
List | pgsql-performance |
Yannick Le Guédart wrote: > Greetings, > > I was trying to get informations on #portgresql about a query plan I > think is quite strange, and I was said to post on this list. I hope my > mail will be clear enough. I have included the query, the query plan, > and the table definitions. I just don't understand the "Seq Scan" on > fileds that are indexed. > > Thanks in advance > > Yannick First off what sort of response times are you getting with the query? or are you just after an understanding of why it plans that way? pg version? What sort of row counts do you have in the tables? Some of the steps show it expecting to get over 2M rows being returned. How many rows are returned without the limit 5? Is it close to what you would expect? From what I see the first seq scan (role_id=5) can't use an index as role_id is only the third element of an index, it would need to be matching all three on the condition to use it. But I don't expect that to make much difference (time wise) if any. The first nested loop has the biggest cost estimate difference and I would expect that to be closer to what you are looking for. My guess is that the joins are not producing the result you expect - start without the joins (and the joined columns) and add one at a time to see which is causing the problem. Then find a better way to join the data. http://www.postgresql.org/docs/8.2/interactive/explicit-joins.html may give you some ideas... > Here we go : > > > -------------------------------- > > QUERY > > SELECT > _article.*, > ( > SELECT COUNT (id) > FROM _comment > WHERE parent_id = _article.id > ) AS nb_comments, > _blog.id as blog_id, > _blog.name as blog_name, > xpath_string(_blog.data,'/data/title') as blog_title, > _blog.reference as blog_ref, > _blog.main_host as blog_main_host, > _user.id as user_id, > _user.reference as user_ref, > _user.nickname as user_nickname > > FROM _article > > INNER JOIN _blog > ON _article.path <@ _blog.path > INNER JOIN _entity_has_element > ON _entity_has_element.element_id = _blog.id > INNER JOIN _user > ON _user.id = _entity_has_element.entity_id > AND _entity_has_element.role_id = 5 > > WHERE _article.id IN > ( > SELECT _relation.destination_id AS id > FROM _relation > WHERE _relation.parent_id = 1008109112 > ) > AND _article.date_publishing < now () > > ORDER BY nb_comments DESC > OFFSET 0 > LIMIT 5 > > -------------------------------- > QUERY PLAN > Limit (cost=378253213.46..378253213.47 rows=5 width=1185) > -> Sort (cost=378253213.46..378260027.29 rows=2725530 width=1185) > Sort Key: (subplan) > -> Hash Join (cost=4907270.58..375534454.12 rows=2725530 width=1185) > Hash Cond: (_entity_has_element.element_id = _blog.id) > -> Hash Join (cost=220747.87..260246.60 rows=543801 width=32) > Hash Cond: (_entity_has_element.entity_id = _user.id) > -> Seq Scan on _entity_has_element (cost=0.00..19696.96 rows=543801 width=16) > Filter: (role_id = 5) > -> Hash (cost=205537.72..205537.72 rows=806972 width=24) > -> Seq Scan on _user (cost=0.00..205537.72 rows=806972 width=24) > -> Hash (cost=4309146.55..4309146.55 rows=2388333 width=1161) > -> Nested Loop (cost=8782.56..4309146.55 rows=2388333 width=1161) > -> Nested Loop (cost=8689.45..43352.96 rows=5012 width=1073) > -> HashAggregate (cost=8689.45..8740.05 rows=5060 width=8) > -> Bitmap Heap Scan on _relation (cost=124.98..8674.40 rows=6021 width=8) > Recheck Cond: (parent_id = 1008109112) > -> Bitmap Index Scan on idx_relation_parent_id (cost=0.00..123.47 rows=6021 width=0) > Index Cond: (parent_id = 1008109112) > -> Index Scan using _article_pkey on _article (cost=0.00..6.83 rows=1 width=1073) > Index Cond: (_article.id = _relation.destination_id) > Filter: (date_publishing < now()) > -> Bitmap Heap Scan on _blog (cost=93.11..845.15 rows=477 width=114) > Recheck Cond: (_article.path <@ _blog.path) > -> Bitmap Index Scan on gist_idx_blog_path (cost=0.00..92.99 rows=477 width=0) > Index Cond: (_article.path <@ _blog.path) > SubPlan > -> Aggregate (cost=135.81..135.82 rows=1 width=8) > -> Index Scan using idx_comment_parent_id on _comment (cost=0.00..135.61 rows=79 width=8) > Index Cond: (parent_id = $0) > > -------------------- > -- Shane Ambler pgSQL (at) Sheeky (dot) Biz Get Sheeky @ http://Sheeky.Biz
pgsql-performance by date: