On Wed, 16 Jun 2021 at 15:08, Peter Geoghegan <pg@bowt.ie> wrote:
> It seems important to distinguish between risk and uncertainty --
> they're rather different things. The short version is that you can
> model risk but you cannot model uncertainty. This seems like a problem
> of uncertainty to me.

You might be right there. "Uncertainty" or "Certainty" seems more
like a value that clauselist_selectivity() would be able to calculate
itself. It would just be up to the planner to determine what to do
with it.

One thing I thought about is that if the costing model were able to
separate out the cost of additional (unexpected) rows, then it would
be easier for add_path() to take into account how badly things might
go if we underestimate.

For example, an unparameterized Nested Loop that estimates the outer
Path to return 1 row will incur the entire inner cost a second time if
there are actually 2 rows. With Hash Join, the cost is just an
additional hashtable lookup, which is dead cheap. I don't know exactly
how add_path() would weigh all that up, but it seems to me that I
wouldn't take the risk unless I was 100% certain that the Nested Loop's
outer Path would return exactly 1 row. If there was any chance at all
it could return more, I'd be picking some other join method.
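
To put rough numbers on that, here's a toy sketch in Python. It is not
planner code: JoinPath, cost_per_extra_row and all of the cost
constants are made up for illustration, and the real costing in
costsize.c is far more involved. It only shows how a per-extra-row
cost, if we had one, would separate the two join methods once the
outer row estimate is off by even one row.

```python
from dataclasses import dataclass


@dataclass
class JoinPath:
    """Hypothetical stand-in for a join path; not a PostgreSQL struct."""
    name: str
    cost_at_estimate: float    # total cost if the outer side returns exactly the estimate
    cost_per_extra_row: float  # assumed extra cost for each outer row beyond the estimate

    def cost(self, extra_rows: int) -> float:
        # Cost when the outer side returns `extra_rows` more rows than estimated.
        return self.cost_at_estimate + extra_rows * self.cost_per_extra_row


# Made-up numbers, chosen only to show the shape of the problem.
INNER_SCAN_COST = 1000.0  # one full scan of the inner side
HASH_BUILD_COST = 1200.0  # scan the inner side once and build the hash table
HASH_PROBE_COST = 0.01    # one hashtable lookup

# Outer estimate is 1 row. The unparameterized Nested Loop rescans the
# whole inner side for every outer row, so each extra row costs a full scan.
nested_loop = JoinPath("Nested Loop", INNER_SCAN_COST, INNER_SCAN_COST)

# Hash Join pays for the build once; each extra outer row is one more probe.
hash_join = JoinPath("Hash Join", HASH_BUILD_COST + HASH_PROBE_COST, HASH_PROBE_COST)


def cheapest(paths, extra_rows):
    """Return the cheapest path given how far the row estimate was off."""
    return min(paths, key=lambda p: p.cost(extra_rows))


if __name__ == "__main__":
    for extra in (0, 1, 9):
        winner = cheapest([nested_loop, hash_join], extra)
        print(f"extra outer rows={extra}: "
              f"NL={nested_loop.cost(extra):.2f} "
              f"HJ={hash_join.cost(extra):.2f} -> {winner.name}")
```

With these invented numbers the Nested Loop only wins when the estimate
is exactly right. One extra outer row doubles its cost, while the Hash
Join's cost barely moves.
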
David