Use of index in 7.0 vs 6.5 - Mailing list pgsql-sql
From | Ryan Bradetich |
---|---|
Subject | Use of index in 7.0 vs 6.5 |
Date | |
Msg-id | 392C6056.D5BF7FE8@hp.com Whole thread Raw |
Responses |
Re: Use of index in 7.0 vs 6.5
(Tom Lane <tgl@sss.pgh.pa.us>)
Re: Use of index in 7.0 vs 6.5 (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-sql |
Tom (Or anyone else who is good with PostgreSQL statistics), I am in the process of transitioning from postgreSQL 6.5.3 to postgreSQL 7.0. I ran into an issue where a sequential scan is being choosen on postgreSQL 7.0 where an index scan was choosen on postgreSQL 6.5.3. Note: All tables have been freshly vacuum'd and analyzed. procman=# select version(); version -------------------------------------------------------------------PostgreSQL 7.0.0 on hppa2.0w-hp-hpux11.00, compiled bygcc 2.95.2 (1 row) procman=# explain select count(catagory) from medusa where host_id = 404 and catagory like 'A%'; NOTICE: QUERY PLAN: Aggregate (cost=189546.19..189546.19 rows=1 width=12) -> Seq Scan on medusa (cost=0.00..189529.43 rows=6704 width=12) EXPLAIN Note: The above query produces an index scan on postgreSQL 6.5.3. procman=# set enable_seqscan = off; SET VARIABLE procman=# explain select count(catagory) from medusa where host_id = 404 and catagory like 'A%'; NOTICE: QUERY PLAN: Aggregate (cost=207347.36..207347.36 rows=1 width=12) -> Index Scan using medusa_host_id_key on medusa (cost=0.00..207330.60 rows=6704 width=12) EXPLAIN Here are the statistics: procman=# select attname,attdisbursion,s.* procman-# from pg_statistic s, pg_attribute a, pg_class c procman-# where starelid = c.oid and attrelid = c.oid and staattnum = attnum procman-# and relname = 'medusa'; attname | attdisbursion | starelid | staattnum | staop | stanullfrac | stacommonfrac | stacommonval |staloval | stahival -----------+---------------+----------+-----------+-------+-------------+---------------+-----------------------------------------------------------------------------+---------------------------- --------------------------------------+----------------------------------------- host_id | 0.00621312 | 30874288 | 1 | 97 | 0 | 0.0279425 | 446 | 0 | 11011 (1 row) Here is my analysis of the stastics (based on the examples in the archive). The most common value host_id in the table is 446 with row fraction of ~ 2.8%. The estimated number of rows in the index is 6704. This table has 4,630,229 entries in the table. Hopefully this analysis is correct, if not .. please correct me :) I do not understand why the planner would choose a seqscan over the index scan because 6704/4,630,229 is ~ 0.15%. Thanks for your time, Ryan - Ryan