Re: [PERFORM] Slow query: bitmap scan troubles - Mailing list pgsql-hackers
From:      Jeff Janes
Subject:   Re: [PERFORM] Slow query: bitmap scan troubles
Msg-id:    CAMkU=1xKAGpT9-t3NkQO1=MRBaMudjZA2OKowSYJxACb7nvorQ@mail.gmail.com
Responses: Re: [PERFORM] Slow query: bitmap scan troubles
List:      pgsql-hackers
[moved to hackers]
On Wednesday, December 5, 2012, Tom Lane wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
>> I now see where the cost is coming from. In commit 21a39de5809 (first
>> appearing in 9.2) the "fudge factor" cost estimate for large indexes
>> was increased by about 10 fold, which really hits this index hard.
>> This was fixed in commit bf01e34b556 "Tweak genericcostestimate's
>> fudge factor for index size", by changing it to use the log of the
>> index size. But that commit probably won't be shipped until 9.3.
>
> Hm. To tell you the truth, in October I'd completely forgotten about
> the January patch, and was thinking that the 1/10000 cost had a lot
> of history behind it. But if we never shipped it before 9.2 then of
> course that idea is false. Perhaps we should backpatch the log curve
> into 9.2 --- that would reduce the amount of differential between what
> 9.2 does and what previous branches do for large indexes.
I think we should backpatch it for 9.2.3. I've seen another report that is probably due to the same issue (a nested loop chosen over a hash join), and monitoring of a database I am responsible for suggests it might be heading in that direction as well as the data grows.
But I am wondering whether it should be present at all in 9.3. When it was introduced, the argument seemed to be that smaller indexes might be easier to keep in cache. And surely that is so. But if a larger index that covers the same type of queries exists alongside a smaller one, we can assume the larger one also exists for a reason. And while it may be easier to keep a smaller index in cache, it is not easier to keep both a larger and a smaller one in cache at the same time. So it seems to me that this reasoning is a wash. (Countering this argument is that a partial index is more esoteric, so if one exists it is more likely to have been well thought out.)
The argument for increasing the penalty by a factor of 10 was that the smaller penalty could be "swamped by noise such as page-boundary-roundoff behavior". I don't really know what that means, but it seems to me that if the choice is so easily swamped by noise, then it probably isn't very important in the first place which index gets chosen. Whereas I think that even the log-based penalty risks being too large for big indexes. (For one thing, it implicitly assumes the fan-out ratio at each btree level is e, when it will usually be much larger than e.)
One cost that does depend on the index size and that, as far as I can tell, is not currently being counted is the cost of comparing tuples all the way down the index. This would be proportional to log2(indextuples) * cpu_index_tuple_cost, or maybe log2(indextuples) * (cpu_index_tuple_cost + cpu_operator_cost), or something like that. This cost would depend on the number of index tuples, not baserel tuples, and so would penalize large indexes. It would be much smaller than the current log(pages/10000) penalty, but it would be principle-based rather than heuristic-based.
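To make that concrete, here is a toy sketch of the three shapes involved. This is not the actual genericcostestimate() code; the formulas are just paraphrased from this discussion (a linear pages/10000 fudge factor, a log curve of the same quantity, and the descent-comparison cost above), with the stock GUC defaults plugged in purely for illustration:

/*
 * Toy comparison of the penalty shapes discussed above.  NOT the real
 * planner code: the formulas are paraphrased from this thread and the
 * constants are just the default GUC settings.
 */
#include <math.h>
#include <stdio.h>

#define CPU_INDEX_TUPLE_COST    0.005   /* default cpu_index_tuple_cost */
#define CPU_OPERATOR_COST       0.0025  /* default cpu_operator_cost */
#define RANDOM_PAGE_COST        4.0     /* default random_page_cost */

int
main(void)
{
    /* a small, a medium, and a large index: (pages, index tuples) */
    double      pages[3] = {100, 10000, 1000000};
    double      tuples[3] = {20000, 2000000, 200000000};
    int         i;

    for (i = 0; i < 3; i++)
    {
        /* linear fudge factor: keeps growing with index size */
        double      linear = pages[i] / 10000.0 * RANDOM_PAGE_COST;

        /* log-curve fudge factor: matches the linear one at the low end,
         * then flattens out */
        double      logcurve = log(1.0 + pages[i] / 10000.0) * RANDOM_PAGE_COST;

        /* descent-comparison cost proposed above: one comparison per
         * level of a binary descent through the index tuples */
        double      descent = (log(tuples[i]) / log(2.0)) *
                              (CPU_INDEX_TUPLE_COST + CPU_OPERATOR_COST);

        printf("pages=%9.0f  linear=%8.3f  log=%7.3f  descent=%6.3f\n",
               pages[i], linear, logcurve, descent);
    }
    return 0;
}

In this sketch a million-page index draws a linear penalty in the hundreds of cost units, a log penalty under twenty, and a descent cost well under one, which is roughly the scale of difference I'm talking about.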
The log(pages/10000) change is more suitable for back-patching because it is more conservative, being asymptotic with the previous behavior at the low end. But I don't think that the case for that previous behavior was ever all that strong.
If we really want a heuristic that gives a bonus to partial indexes, maybe we should explicitly give them a bonus, rather than penalizing ordinary indexes (a penalty that then also factors into comparisons against hash joins and the like, not just against partial indexes).
maybe something like bonus = 0.05 * (reltuples-indextuples)/reltuples
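Spelled out, purely as a sketch (the 0.05 and the question of where the bonus would actually be subtracted from the cost are of course up for grabs):

/*
 * Sketch of an explicit bonus for partial indexes, instead of a penalty
 * on ordinary ones.  Hypothetical helper; the 0.05 constant and the
 * scaling by the fraction of the heap the index excludes are just the
 * formula above.
 */
static double
partial_index_bonus(double reltuples, double indextuples)
{
    double      frac_excluded;

    if (reltuples <= 0.0 || indextuples >= reltuples)
        return 0.0;             /* full index (or bogus stats): no bonus */

    /* fraction of the heap that the partial index does not cover */
    frac_excluded = (reltuples - indextuples) / reltuples;

    return 0.05 * frac_excluded;
}

That way ordinary indexes cost what they cost, and the bonus only shows up for indexes that actually exclude part of the heap.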
Cheers,
Jeff