On 12/21/15 8:36 AM, Tom Lane wrote:
>> but it might be nice to have some number as to how
>> reliable a certain estimate is, which is high if the estimate is, say,
>> derived from a single filter on a base table and sinks as more conditions
>> are involved or numbers pulled out of thin air.
> That might indeed be a useful thing to try to do, though I confess I'm
> not quite sure what we'd do with such numbers if we had them. It seems
> like the next thing would be to replace single cost estimates for plans
> with cost ranges, but then how do you compare a couple of overlapping
> ranges?
I suspect that doesn't happen often enough to be a big loss if we just
compare the midpoints in those cases, at least for a first pass. Beyond
that, I think it'd need to be handled on a per-node basis. Nodes that
depend heavily on having low row counts (like nested loop) would want to
heavily penalize the high side of the range. Nodes that aren't as
sensitive to misestimation wouldn't need to penalize it as much, and there
could even be some nodes that penalize the low side of the range instead.
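Just to make the idea concrete, here's roughly the shape of thing I'm
picturing. This is only a sketch; none of these names exist in the
planner, and the real costing code obviously wouldn't look like this:

#include <stdbool.h>

/*
 * Illustrative only: each cost carries a low/mid/high range, and each
 * node type supplies a "risk weight" saying how much of the high end of
 * the range to fold into the value used for comparison.
 */
typedef struct CostRange
{
    double      low;    /* optimistic cost */
    double      mid;    /* the single estimate we compute today */
    double      high;   /* pessimistic cost */
} CostRange;

/*
 * risk_weight 0.0 means "just use the midpoint", 1.0 means "assume the
 * worst case".  A nested loop might use something near 1.0; a hash join
 * something much smaller.
 */
static double
effective_cost(const CostRange *c, double risk_weight)
{
    return c->mid + risk_weight * (c->high - c->mid);
}

static bool
cost_range_cheaper(const CostRange *a, double a_risk,
                   const CostRange *b, double b_risk)
{
    return effective_cost(a, a_risk) < effective_cost(b, b_risk);
}

The per-node risk weight is the part that would need actual thought; the
comparison itself stays a simple scalar comparison.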
BTW, I suspect that rather than a simple range we might want to pass the
exact computed estimate along with estimated upper and lower error
margins. An estimate of 827 +0 -400 could have a very different meaning
than an estimate of [427, 827].
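In other words, something along these lines (again, purely hypothetical
names, just to show what I mean by carrying the margins separately):

#include <stdio.h>

/*
 * Hypothetical representation: the point estimate plus how far off it
 * could plausibly be in each direction.  827 +0 -400 says "827 is our
 * best guess and it can't really be higher, but it could be much lower",
 * which is not the same as a bare interval whose midpoint is 627.
 */
typedef struct RowEstimate
{
    double      rows;           /* the point estimate */
    double      margin_high;    /* how much higher it could plausibly be */
    double      margin_low;     /* how much lower it could plausibly be */
} RowEstimate;

int
main(void)
{
    RowEstimate est = { .rows = 827, .margin_high = 0, .margin_low = 400 };

    printf("estimate %.0f, range [%.0f, %.0f]\n",
           est.rows, est.rows - est.margin_low, est.rows + est.margin_high);
    return 0;
}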
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com