On Fri, 23 Apr 2004 brad-pgperf@duttonbros.com wrote:
> I have a query which I think should be using an index all of the time but
> postgres only uses the index part of the time. The index
> (ticket_crm_map_crm_id_suppid) has the where clause column (crm_id) listed
> first followed by the selected column (support_person_id). Wouldn't the
> most efficient plan be to scan the index each time because the only columns
> needed are in the index? Below is the table, 2 queries showing the
Not necessarily. The rows in the actual file still need to be checked to
see if they're visible to the select and if it's expected that the entire
file (or a reasonable % of the pages anyway) will need to be loaded using
the index isn't necessarily a win.
> athenapost=> explain analyze select distinct support_person_id from
> ticket_crm_map where crm_id = 1;
> QUERY PLAN
> ----------------------------------------------------------------------------
> -----------------------------------------------------
> Unique (cost=10911.12..11349.26 rows=32 width=4) (actual
> time=659.102..791.517 rows=24 loops=1)
> -> Sort (cost=10911.12..11130.19 rows=87628 width=4) (actual
> time=659.090..713.285 rows=93889 loops=1)
> Sort Key: support_person_id
> -> Seq Scan on ticket_crm_map (cost=0.00..3717.25 rows=87628
> width=4) (actual time=0.027..359.299 rows=93889 loops=1)
> Filter: (crm_id = 1)
> Total runtime: 814.601 ms
How far off is this from the index scan version in time? Try doing
set enable_seqscan=off; and then explain analyzing again.
It's possible that you may wish to lower random_page_cost to change the
estimated effect of how much more expensive random reads are compared to
sequential ones.