Re: force the use of a particular index - Mailing list pgsql-performance
From | Scott Cain |
---|---|
Subject | Re: force the use of a particular index |
Date | |
Msg-id | 1057937779.5767.17.camel@localhost.localdomain Whole thread Raw |
In response to | Re: force the use of a particular index (Rod Taylor <rbt@rbt.ca>) |
Responses |
Re: force the use of a particular index
|
List | pgsql-performance |
Rod, I see what you mean about the scan on the feature_pkey taking a long time. I tried several things to remedy that. I created an index on feature (feature_id,type_id) (which I don't think makes sense since feature_id is the primary key, so add another column really doesn't help). I also created a index on feature (type_id, feature_id), but the planner doesn't use it. Also, there was an already existing index on feature (type_id) that the planner never used. One thing I tried that changed the query plan and improved performance slightly (but still nowhere near what I need) was to add a partial index on featureloc on (fmin,fmax) where scrfeature_id=6. This is something I could realistically do since there are relatively few (>30) srcfeature_ids that I am interested in, so putting in place a partial index for each of them would not be a big deal. Nevertheless, the performance is still not there. Here is the EXPLAIN ANALYZE for this situation: QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------------- Unique (cost=156172.23..156200.11 rows=159 width=54) (actual time=63631.93..63631.93 rows=1 loops=1) -> Sort (cost=156172.23..156176.21 rows=1594 width=54) (actual time=63631.93..63631.93 rows=1 loops=1) Sort Key: f.name, fl.fmin, fl.fmax, fl.strand, f.type_id, f.feature_id -> Hash Join (cost=135100.30..156087.46 rows=1594 width=54) (actual time=63631.29..63631.79 rows=1 loops=1) Hash Cond: ("outer".feature_id = "inner".feature_id) -> Index Scan using featureloc_src_6 on featureloc fl (cost=0.00..18064.99 rows=101883 width=14) (actualtime=26.11..430.00 rows=570 loops=1) Index Cond: ((fmin <= 2585581) AND (fmax >= 2565581)) Filter: (srcfeature_id = 6) -> Hash (cost=134601.43..134601.43 rows=48347 width=40) (actual time=63182.86..63182.86 rows=0 loops=1) -> Index Scan using feature_pkey on feature f (cost=0.00..134601.43 rows=48347 width=40) (actual time=69.98..62978.27rows=13825 loops=1) Filter: (type_id = 219) Total runtime: 63632.28 msec (12 rows) Any other ideas? Thanks, Scott On Fri, 2003-07-11 at 09:38, Rod Taylor wrote: > On Fri, 2003-07-11 at 13:17, Scott Cain wrote: > > The problem (at least as it appears to me) is not that it is performing > > a table scan instead of an index scan, it is that it is using the wrong > > index. Here is the output from EXPLAIN ANALYZE: > > > > QUERY PLAN > > ------------------------------------------------------------------------------------------------------------------------------------------------------------ > > Unique (cost=494008.47..494037.59 rows=166 width=54) (actual time=114660.37..114660.38 rows=1 loops=1) > > -> Sort (cost=494008.47..494012.63 rows=1664 width=54) (actual time=114660.37..114660.37 rows=1 loops=1) > > Sort Key: f.name, fl.fmin, fl.fmax, fl.strand, f.type_id, f.feature_id > > -> Nested Loop (cost=0.00..493919.44 rows=1664 width=54) (actual time=2596.13..114632.90 rows=1 loops=1) > > -> Index Scan using feature_pkey on feature f (cost=0.00..134601.43 rows=52231 width=40) (actual time=105.74..56048.87rows=13825 loops=1) > > Filter: (type_id = 219) > > -> Index Scan using featureloc_idx1 on featureloc fl (cost=0.00..6.87 rows=1 width=14) (actual time=4.23..4.23rows=0 loops=13825) > > Index Cond: ("outer".feature_id = fl.feature_id) > > Filter: ((srcfeature_id = 6) AND (fmin <= 2585581) AND (fmax >= 2565581)) > > Total runtime: 114660.91 msec > > > it is using on featureloc (featureloc_idx1) is on the foreign key > > feature_id. It should instead be using another index, featureloc_idx3, > > which is built on (srcfeature_id, fmin, fmax). > > Nope.. The optimizer is right in the decision to use featureloc_idx1. > You will notice it is expecting to retrieve a single row from this > index, but the featureloc_idx3 is bound to be larger (due to indexing > more data), thus take more disk reads for the exact same information (or > in this case, lack thereof). > > What is taking a long time is the scan on feature_pkey. It looks like it > is throwing away a ton of rows that are not type_id = 219. Either that, > or you do a pile of deletes and haven't run REINDEX recently. > > Create an index consisting of (feature_id, type_id). This will probably > make a significant different in execution time. -- ------------------------------------------------------------------------ Scott Cain, Ph. D. cain@cshl.org GMOD Coordinator (http://www.gmod.org/) 216-392-3087 Cold Spring Harbor Laboratory
pgsql-performance by date: