> In the "after" case you are showing "18105XS" as the most common
> actor_id, with a frequency of 11.2% of the entries. Where'd that
> come from? Is it correct?
I believe this is correct. The reason I haven't been seeing poor
performance on the old database is that the stats are not up to date on the
*old* DB. I'd been so focused on problems with the updated DB that I never
suspected that the database that was performing well was the one with the
"bad" stats.
> 7.1 basically uses stacommonfrac as the selectivity estimate if the
> constant is the stacommonval, otherwise stacommonfrac/10. So if
> stacommonfrac goes up, so does the estimated number of rows retrieved,
> and that's what's changing your plan.
I understand. That piece of info put us on the track of a solution.
In this case, we've got an unusual distribution that looks like this:
Among the actors to which cases may be assigned:
  - the State gets 10% of the cases;
  - 8 judges get 3.5% of the cases each;
  - 50 attorneys get about 0.1% each;
  - the remaining 388,000 actors get about 0.001% each.
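Plugging these numbers into the 7.1 rule quoted above (taking
stacommonfrac = 0.112 from the "after" stats), a query on any actor_id
other than the most common one gets an estimated selectivity of

    0.112 / 10 = 0.0112, i.e. about 1.1% of the table,

versus actual fractions of 3.5% (a judge), 0.1% (an attorney), or
0.001% (any other actor). For a typical actor the planner is expecting
roughly a thousand times more rows than the query really returns, which
is presumably what pushes it off the index scan.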
Given this unusual distribution, the planner can't predict well, so we're
thinking the best way to handle this is to set up a script that does our
vacuum analyze and then updates stacommonfrac to 0.01 for this particular
field.
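For concreteness, here's a rough sketch of what the script would run
(the names "cases" and "actor_id" are stand-ins for our real table and
column, and updating pg_statistic requires superuser):

    VACUUM ANALYZE cases;

    -- Re-apply the override after each analyze, since VACUUM ANALYZE
    -- rewrites the pg_statistic row for the column.
    UPDATE pg_statistic
       SET stacommonfrac = 0.01
     WHERE starelid = (SELECT oid FROM pg_class WHERE relname = 'cases')
       AND staattnum = (SELECT a.attnum
                          FROM pg_attribute a, pg_class c
                         WHERE a.attrelid = c.oid
                           AND c.relname = 'cases'
                           AND a.attname = 'actor_id');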
This should give us great performance for the vast majority of possible
queries. It looks like the 8 judges are on the borderline, and we'll get
poor performance if anyone happens to query on the State, but most of our
users should understand that would be a silly query anyway.
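The rough arithmetic behind those expectations: with stacommonfrac
forced to 0.01, the estimate for any non-most-common value becomes
0.01 / 10 = 0.001, i.e. 0.1% of the table. That's dead on for the
attorneys, a large but presumably harmless overestimate for the 388,000
ordinary actors, about a 35x underestimate for the judges (hence
"borderline"), and if the State remains the stacommonval its queries
get estimated at 1% of the table when they really hit 10-11%.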
Is there a more graceful way to do this?
Tom -- thanks for your patient help with this. We'll be sure to try this
without the planner tweaks when we upgrade to 7.2 and let you know how it
goes.
regards,
-Nick