Optimizer not using index when it should - Mailing list pgsql-performance
From | Rob Messer |
---|---|
Subject | Optimizer not using index when it should |
Date | |
Msg-id | 20030429090301.13152.qmail@web41213.mail.yahoo.com Whole thread Raw |
Responses |
Re: Optimizer not using index when it should
|
List | pgsql-performance |
Like some other recent messages, this one is about getting postgresql to use an index that it seems it should clearly use. (But this one has nothing to do with count(*)). Here is my table: Column | Type | Modifiers --------------+-----------------------------+----------- dsid | character varying(20) | not null recid | numeric(8,0) | not null trans_id | character varying(16) | not null status | character varying(1) | init_ts | timestamp without time zone | last_ts | timestamp without time zone | last_form_id | character varying(8) | secval | character varying(20) | Indexes: ds_rec1 unique btree (recid), ds_rec2 btree (dsid) Here is my version info: PostgreSQL 7.3.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2 20020903 (Red Hat Linux 8.0 3.2-7) Currently the table ds_record has about 250,000 records in it. Of those, about 3000 have dsid = 'starz'. When I need to look up all the recids with this dsid in ds_record, I have the following simple query: select recid from ds_record where dsid = 'startz'; But it doesn't use the index ds_rec2 on dsid. Here is the explain analyze output: intellis2=> explain analyze select recid from ds_record where dsid = 'starz'; QUERY PLAN -------------------------------------------------------------------------------------------------------------- Seq Scan on ds_record (cost=0.00..6186.21 rows=3484 width=12) (actual time=10.60..408.12 rows=3484 loops=1) Filter: (dsid = 'starz'::character varying) Total runtime: 410.14 msec (3 rows) but if I turn off seqscan I get this: intellis2=> set enable_seqscan=off; SET intellis2=> explain analyze select recid from ds_record where dsid = 'starz'; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------- Index Scan using ds_rec2 on ds_record (cost=0.00..7185.47 rows=3484 width=12) (actual time=0.17..12.94 rows=3484 loops=1) Index Cond: (dsid = 'starz'::character varying) Total runtime: 14.97 msec (3 rows) so it is faster by more than a factor of 25 to use the index. The problem gets worse when I add a join to the table. I have tried the following: alter table ds_record alter dsid set statistics 1000; vacuum analyze ds_record; drop index ds_rec2; CREATE INDEX ds_rec2 ON ds_record USING btree (dsid); But to no avail, I get the same results. Interestingly, for queries that return fewer rows it does use the correct index. For example, dsid="mapbuy2" appears about 500 times in ds_record. Here is the explain out there (with enable_seqscan back on): intellis2=> explain analyze select recid from ds_record where dsid = 'mapbuy2'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------- Index Scan using ds_rec2 on ds_record (cost=0.00..1351.17 rows=522 width=12) (actual time=0.18..4.31 rows=522 loops=1) Index Cond: (dsid = 'mapbuy2'::character varying) Total runtime: 4.68 msec To me it seems that the threshold for doing a table scan is wrong -- when the rows retrieved are about 1.25% of the table it does a scan. What can I do to fix this -- is there something I am missing about setting statistics or some configuration variable I can change? Any insights would be greatly appreciated. Thank you, Rob Messer __________________________________ Do you Yahoo!? The New Yahoo! Search - Faster. Easier. Bingo. http://search.yahoo.com
pgsql-performance by date: