Tom Lane wrote:
> John Meinel <john@johnmeinel.com> writes:
>
>>I was looking into another problem, and I found something that surprised
>>me. If I'm doing "SELECT * FROM mytable WHERE col = 'myval' LIMIT 1.".
>>Now "col" is indexed, by mytable has 500,000 rows, and 'myval' occurs
>>maybe 100,000 times. Without the LIMIT, this query should definitely do
>>a sequential scan.
>
>
>>But with the LIMIT, doesn't it know that it will return at max 1 value,
>>and thus be able to use the index?
>
>
> But the LIMIT will cut the cost of the seqscan case too. Given the
> numbers you posit above, about one row in five will have 'myval', so a
> seqscan can reasonably expect to hit the first matching row in the first
> page of the table. This is still cheaper than doing an index scan
> (which must require reading at least one index page plus at least one
> table page).
>
> The test case you are showing is probably suffering from nonrandom
> placement of this particular data value; which is something that the
> statistics we keep are too crude to detect.
>
> regards, tom lane
You are correct about non-random placement. I'm a little surprised it
doesn't change with values, then. For instance,
# select count(*) from finst_t where store_id = 52;
13967
Still does a sequential scan for the "select id from..." query.
The only value it does an index query for is 9605 which only has 1 row.
It estimates ~18,000 rows, but that is still < 3% of the total data.
This row corresponds to disk location where files can be found. So when
a storage location fills up, generally a new one is created. This means
that *generally* the numbers will be increasing as you go further in the
table (not guaranteed, as there are multiple locations open at any one
time).
Am I better off in this case just wrapping my query with:
set enable_seqscan to off;
query
set enable_seqscan to on;
There is still the possibility that there is a better way to determine
existence of a value in a column. I was wondering about something like:
SELECT 1 WHERE EXISTS
(SELECT id FROM finst_t WHERE store_id=52 LIMIT 1);
Though the second part is the same, so it still does the sequential scan.
This isn't critical, I was just trying to understand what's going on.
Thanks for your help.
John
=:->