Re: sql query not using indexes - Mailing list pgsql-sql

From Tom Lane
Subject Re: sql query not using indexes
Date
Msg-id 29455.969641492@sss.pgh.pa.us
Whole thread Raw
In response to Re: sql query not using indexes  (Stephan Szabo <sszabo@megazone23.bigpanda.com>)
Responses Re: sql query not using indexes  (Stephan Szabo <sszabo@megazone23.bigpanda.com>)
List pgsql-sql
Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
>> Ok I agree with you on the real database there are 127,300 rows and there
>> are certanly a great number of rows > 'AAA'. But, supose I make a query
>> select * from table where code > 'AAA' limit 10. it will read the entire
>> table only to give me the first 10 while in release 6.5 it will fetch the
>> index for the first 10 in a very fast manner, indeed the 6.5 release
>> resolves in 1 second while the 7.0 release resolves in 10-20 sec.

> Hmm, I believe Tom Lane was doing alot of stuff with the optimizer and
> limit but I don't remember if that was before or after the 7.0 release.
> It might be worth trying on current sources to see if that goes back to
> an index scan.

No, it'll still do a seqscan.  6.5 was in fact too ready to use
indexscans; the current code may have overcorrected a shade, but I think
it's closer to reality than 6.5 was.

As Hiroshi already commented, the difference in results suggests that
the desired data is very nonuniformly scattered in the table.  7.0
computes cost estimates on the assumption that the target data is
uniformly scattered.  For a sufficiently nonselective WHERE condition
(ie, one that the planner thinks will match a large fraction of the
table's rows) it looks better to do a seqscan and pick up the matching
rows than to follow the index pointers.  Adding a LIMIT doesn't change
this equation.

I like Hiroshi's recommendation: add an ORDER BY to help favor the
indexscan.
        regards, tom lane


pgsql-sql by date:

Previous
From: "Josh Berkus"
Date:
Subject: Re: [Fwd: Re: no ORDER BY in subselects?]
Next
From: Stephan Szabo
Date:
Subject: Re: sql query not using indexes