On Fri, 05 Nov 2004 23:04:23 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Matt Clark <matt@ymogen.net> writes:
> > Well, 74000/76000000 ~= 0.1%, way less than 1/26, so no surprise that an
> > indexscan is better, and also no surprise that the planner can't know
> > that I is such an uncommon initial char.
>
> But it *can* know that, at least given adequate ANALYZE statistics.
> I'm pretty convinced that the basic answer to Allen's problem is to
> increase the histogram size. How large he needs to make it is not
> clear --- obviously his data distribution is not uniform, but I don't
> have a fix on how badly non-uniform.
>
Tom just an update, it's now 2am.. several hours since I started that
EXPLAIN ANALYZE and it still hasn't finished, so I've aborted it. I
will do the example with the more precise substring instead to
illustrate the performance differences, both with and without the
substring index and report back here.
I'm also interested in something someone else posted, namely that the
36 indexes I have, "A%" through "Z%" and "0%" through "9%" could be
replaced with a single index like:
"CREATE INDEX idx_table_field_substr ON table substr(field, 1, 1);"
I'm wondering, histogram and other information aside, will this
function as well (or better) than creating all the individual indexes?