Optimizer not using index when it should - Mailing list pgsql-performance

From Rob Messer
Subject Optimizer not using index when it should
Date
Msg-id 20030429090301.13152.qmail@web41213.mail.yahoo.com
Whole thread Raw
Responses Re: Optimizer not using index when it should
List pgsql-performance
Like some other recent messages, this one is about getting postgresql
to use an index that it seems it should clearly use.  (But this one has
nothing to do with count(*)).

Here is my table:

    Column    |            Type             | Modifiers
--------------+-----------------------------+-----------
 dsid         | character varying(20)       | not null
 recid        | numeric(8,0)                | not null
 trans_id     | character varying(16)       | not null
 status       | character varying(1)        |
 init_ts      | timestamp without time zone |
 last_ts      | timestamp without time zone |
 last_form_id | character varying(8)        |
 secval       | character varying(20)       |
Indexes: ds_rec1 unique btree (recid),
         ds_rec2 btree (dsid)

Here is my version info:
 PostgreSQL 7.3.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2
20020903 (Red Hat Linux 8.0 3.2-7)

Currently the table ds_record has about 250,000 records in it.  Of
those, about 3000 have dsid = 'starz'. When I need to look up all the
recids with this dsid in ds_record, I have the following simple query:

select recid from ds_record where dsid = 'startz';

But it doesn't use the index ds_rec2 on dsid.  Here is the explain
analyze output:

intellis2=> explain analyze select recid from ds_record where dsid =
'starz';
                                                  QUERY PLAN

--------------------------------------------------------------------------------------------------------------
 Seq Scan on ds_record  (cost=0.00..6186.21 rows=3484 width=12) (actual
time=10.60..408.12 rows=3484 loops=1)
   Filter: (dsid = 'starz'::character varying)
 Total runtime: 410.14 msec
(3 rows)

but if I turn off seqscan I get this:

intellis2=> set enable_seqscan=off;
SET
intellis2=> explain analyze select recid from ds_record where dsid =
'starz';
                                                         QUERY PLAN


----------------------------------------------------------------------------------------------------------------------------
 Index Scan using ds_rec2 on ds_record  (cost=0.00..7185.47 rows=3484
width=12)
(actual time=0.17..12.94 rows=3484 loops=1)
   Index Cond: (dsid = 'starz'::character varying)
 Total runtime: 14.97 msec
(3 rows)

so it is faster by more than a factor of 25 to use the index.  The
problem gets worse when I add a join to the table.

I have tried the following:

alter table ds_record alter dsid set statistics 1000;
vacuum analyze ds_record;
drop index ds_rec2;
CREATE INDEX ds_rec2 ON ds_record USING btree (dsid);

But to no avail, I get the same results.

Interestingly, for queries that return fewer rows it does use the
correct index.  For example, dsid="mapbuy2" appears about 500 times in
ds_record.  Here is the explain out there (with enable_seqscan back
on):

intellis2=> explain analyze select recid from ds_record where dsid =
'mapbuy2';
                                                       QUERY PLAN


-------------------------------------------------------------------------------------------------------------------------
 Index Scan using ds_rec2 on ds_record  (cost=0.00..1351.17 rows=522
width=12) (actual time=0.18..4.31 rows=522 loops=1)
   Index Cond: (dsid = 'mapbuy2'::character varying)
 Total runtime: 4.68 msec

To me it seems that the threshold for doing a table scan is wrong --
when the rows retrieved are about 1.25% of the table it does a scan.

What can I do to fix this -- is there something I am missing about
setting statistics or some configuration variable I can change?  Any
insights would be greatly appreciated.  Thank you,

Rob Messer



__________________________________
Do you Yahoo!?
The New Yahoo! Search - Faster. Easier. Bingo.
http://search.yahoo.com


pgsql-performance by date:

Previous
From: Shridhar Daithankar
Date:
Subject: Re: Is 292 inserts/sec acceptable performance ?
Next
From: Rajesh Kumar Mallah
Date:
Subject: Re: Is 292 inserts/sec acceptable performance ?