mlw wrote:
> Now, given the choice of the two strategies on a table, both pretty close to
> one another, the risk of poor performance for using the index scan is minimal
> based on the statistics, but the risk of poor performance for using the
> sequential scan is quite high on a large table.
Wow, what did I start here?
OK, let me see if I can explain why the idea of an index being present
is not significant, and also explain why doing a sequential scan is
_less_ risky than a index scan.
First, if an admin creates an index, it does mean he thinks it will
help, but is he right? You could say that if they create the index, use
it, and if the admin finds it makes the query slower, he can then remove
it, and this does give him some control over the optimizer.
However, this assumes two things. First, it assumes the admin will
actually check to see if the index helps, and if it doesn't remove it,
but more importantly, it assumes there is only one type of query for
that table. That is the biggest fallacy. If I do:
SELECT * FROM tab WHERE col = 0;
I may be selecting 70% of the table, and an index scan will take
forever if 70% of the table (plus index pages) is significancy larger
than the cache size; every row lookup will have to hit the disk!
However, if I do:
SELECT * FROM tab WHERE col = 89823;
and 89823 is a rare value, perhaps only one row in the table, then an
index would be good to use, so yes, indexes can be added by admins to
improve performance, but the admin is creating the index probably for
the second query, and certainly doesn't want the index used for the
first query.
Also, these are simple queries. Add multiple tables and join methods,
and the idea that an admin creating an index could in any way control
these cases is implausible.
My second point, that index scan is more risky than sequential scan, is
outlined above. A sequential scan reads each page once, and uses the
file system read-ahead code to prefetch the disk buffers. Index scans
are random, and could easily re-read disk pages to plow through a
significant portion of the table, and because the reads are random,
the file system will not prefetch the rows so the index scan will have
to wait for each non-cache-resident row to come in from disk.
-- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610)
853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill,
Pennsylvania19026