Index not being used in sorting of simple table - Mailing list pgsql-performance

From Paul Smith
Subject Index not being used in sorting of simple table
Date
Msg-id VPOP32.5.0.20070504153630.092.309f.1.f0e0d0d3@lmail.pscs.co.uk
Whole thread Raw
Responses Re: Index not being used in sorting of simple table  (Heikki Linnakangas <heikki@enterprisedb.com>)
Re: Index not being used in sorting of simple table  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
This is in Postgres 8.1.5

I have a table like
CREATE TABLE x (a VARCHAR, b VARCHAR, c VARCHAR);
CREATE INDEX y on x(a);
CREATE INDEX z on x(b);

There are over a million rows in 'x'. Neither a nor b are unique.
There are probably about 20 or so distinct values of a and 30 or so
distinct values of b

I've done a 'vacuum analyze' first.

If I do
EXPLAIN SELECT * FROM x ORDER BY a;
it says
  Index Scan using y on x  (cost=0.00..2903824.15 rows=1508057 width=152)

That's what I'd expect

However, if I do
EXPLAIN SELECT * FROM x ORDER BY b;
it says
Sort  (cost=711557.34..715327.48 rows=1508057
width=152)
    Sort Key:
b
    ->  Seq Scan on x  (cost=0.00..53203.57 rows=1508057 width=152)

Why doesn't it use the other index? If use 'set enable_seqscan=0' then it does.

I tried using EXPLAIN ANALYZE to see how long it actually took:
- seq scan - 75 secs
- index scan - 13 secs
- seq scan - 77 secs
(I tried the seq scan version after the index scan as well to see if
disk caching was a factor, but it doesn't look like it)

If I do something like SELECT * FROM x WHERE b='...'; then it does
use the index , it's just for ordering it doesn't seem to. (Yes, it's
a BTREE index, not a hash index)

Oh, and if I use
EXPLAIN SELECT * FROM x ORDER BY b LIMIT 100000;
then it uses the index scan, not the seq scan.
If I use
EXPLAIN SELECT * FROM x ORDER BY b LIMIT 1000000;
it uses the seq scan again, so I can't just set an arbitrarily big
limit to use the index.

Any ideas? To me it looks like a bug in the planner. I can't think of
any logical reason not to use an existing index to retrieve a sorted
listing of the data.

Paul                            VPOP3 - Internet Email Server/Gateway
support@pscs.co.uk                      http://www.pscs.co.uk/



pgsql-performance by date:

Previous
From: Michael Stone
Date:
Subject: Re: pg_stat_* collection
Next
From: Scott Marlowe
Date:
Subject: Re: Query performance problems with partitioned tables