I RTFM but I'm still confused. I have a table, headers
> mab=> \d headers
> Table "headers"
> Attribute | Type | Modifier
> -----------+---------+---------------------------------------------------------
> header_id | integer | not null default nextval('headers_header_id_seq'::text)
> part_id | integer | not null
> key | text |
> value | text |
> Index: headers_pkey
with 13 million rows:
> mab=> select count(*) from headers;
> count
> ----------
> 13411618
> (1 row)
I have indexed the table by part_id:
> mab=> \d headers_ref_idx
> Index "headers_ref_idx"
> Attribute | Type
> -----------+---------
> part_id | integer
> btree
And I have just run `vacuum analyze'. But the index isn't used:
> mab=> EXPLAIN SELECT * FROM headers WHERE part_id = 10;
> NOTICE: QUERY PLAN:
>
> Seq Scan on headers (cost=100000000.00..100361471.22 rows=22 width=32)
>
> EXPLAIN
The full table scan is extremely expensive (over 5 minutes wall clock
time). `set enable_seqscan = off' doesn't cause the index to be used
either:
> mab=> set enable_seqscan = off;
> SET VARIABLE
> mab=> explain SELECT * FROM headers WHERE part_id = 10;
> NOTICE: QUERY PLAN:
>
> Seq Scan on headers (cost=100000000.00..100361471.22 rows=22 width=32)
>
> EXPLAIN
According to the archives, the query planner decides whether to use an
index on a column based on the frequency of the most common value for
that column. But that's only ~200 rows, out of 13 million:
> mab=> SELECT part_id, count(*) AS count FROM headers GROUP BY part_id ORDER BY count DESC LIMIT 10;
> part_id | count
> ---------+-------
> 561415 | 219
> 114157 | 219
> 561414 | 219
> 114158 | 215
> 561418 | 215
> 561421 | 215
> 558872 | 74
> 558869 | 67
> 141780 | 62
> 202113 | 60
> (10 rows)
You'd think the index would still be cheaper. It's almost like it
doesn't exist. I tried removing and recreating it (and running vacuum
analyze again), but that didn't help.
(This is Postgres 7.1.2 under FreeBSD 4.4.)