Centuries ago, Nostradamus foresaw when ronz@ravensfield.com (Andrew Rawnsley) would write:
> I would like, of course, for it to use the index, given that it
> takes 20-25% of the time. Fiddling with CPU_TUPLE_COST doesn't do
> anything until I exceed 0.5, which strikes me as a bit high (though
> please correct me if I am assuming too much...). RANDOM_PAGE_COST
> seems to have no effect. I suppose I could cluster it, but it is
> constantly being added to, and would have to be re-done on a daily
> basis (if not more).
>
> Any suggestions?
The apparent problem is a bad query plan, and having to cluster the
table to "fix" it seems a disturbing answer.
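One way to confirm that the planner's estimates are off is to compare
the plan it picks against the plan you want; a sketch, assuming a
hypothetical query on a "customer" column (enable_seqscan is just the
standard per-session planner toggle):

SET enable_seqscan = off;
EXPLAIN SELECT * FROM my_table WHERE customer = 42;
SET enable_seqscan = on;
EXPLAIN SELECT * FROM my_table WHERE customer = 42;

If the forced index scan is estimated as more expensive but actually
runs in the 20-25% of the time you mention, then the estimates
(i.e. - the statistics) are the thing to fix.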
A problem I saw last week with some query plans came down to the
statistics being inadequate.
We had some queries where an index on "customer" was extremely
worthwhile in nearly all cases, but the planner often wasn't using
it.  The problem was that the 10 "bins" kept under the default
statistics target would collect stats about a few _highly_ active
customers, and pretty much ignore the less active ones.  Because the
bins were dominated by the few common values, stats for the other
customers were missing and pretty useless.
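You can see whether that's happening by peeking at the pg_stats view
for the column; a sketch, assuming a hypothetical table my_table with
a customer column:

SELECT n_distinct, most_common_vals, most_common_freqs
  FROM pg_stats
 WHERE tablename = 'my_table' AND attname = 'customer';

If most_common_freqs adds up to nearly all of the table while
n_distinct is far larger than the handful of values listed, the less
active customers are effectively invisible to the planner.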
I upped the statistics target from 10 to 100, which allowed stats to
be kept for the less active customers, GREATLY improving the quality
of the resulting query plans.
The point that falls out is that if you have a column with a bunch of
discrete values (rather more than 10) that aren't near-unique
(e.g. - on a table with a million transactions, you have only a few
hundred customers), that's a good candidate for upping the column's
stats target.
Thus, you might try:
ALTER TABLE MY_TABLE ALTER COLUMN SOME_COLUMN SET STATISTICS 50;
ANALYZE MY_TABLE;
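After the ANALYZE, it's worth re-checking that the index actually
gets used now; a sketch, with the literal value as a placeholder:

EXPLAIN ANALYZE
SELECT * FROM MY_TABLE WHERE SOME_COLUMN = 'some value';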
--
let name="cbbrowne" and tld="ntlug.org" in name ^ "@" ^ tld;;
http://www.ntlug.org/~cbbrowne/postgresql.html
"There's no longer a boycott of Apple. But MacOS is still a
proprietary OS." -- RMS - June 13, 1998