On Mon, Jun 21, 2021 at 4:51 PM Bruce Momjian <bruce@momjian.us> wrote:
> There were a lot of interesting ideas in this thread and I want to
> analyze some of them. First, there is the common assumption (not
> stated) that over-estimating by 5% and underestimating by 5% cause the
> same harm, which is clearly false. If I go to a restaurant and estimate
the bill to be 5% higher or 5% lower, assuming I have sufficient funds,
> under or over estimating is probably fine. If I am driving and
> under-estimate the traction of my tires, I am probably fine, but if I
> over-estimate their traction by 5%, I might crash.
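
A toy cost model (mine, not the planner's actual cost functions) makes
the asymmetry concrete: a nested loop's cost is roughly multiplicative
in the outer row count, while a hash join's is roughly additive, so the
same misestimate is far more dangerous for one than the other:

```python
# Hypothetical, simplified join costs -- not PostgreSQL's real formulas.

def nestloop_cost(outer_rows, inner_rows):
    # Each outer row rescans the inner side: cost grows multiplicatively.
    return outer_rows * inner_rows

def hashjoin_cost(outer_rows, inner_rows):
    # Build the hash table once, probe once: cost grows additively.
    return outer_rows + inner_rows

est_outer, actual_outer, inner = 1, 1000, 10000

# At the estimated 1 outer row, the nested loop looks cheapest ...
assert nestloop_cost(est_outer, inner) < hashjoin_cost(est_outer, inner)

# ... but if the estimate is off by 1000x, the nested loop pays for it
# in full, while the hash join would have degraded only linearly.
print(nestloop_cost(actual_outer, inner))  # -> 10000000
print(hashjoin_cost(actual_outer, inner))  # -> 11000
```

Under-estimating the outer side here turns a "cheapest" plan into one
that is ~1000x worse than the safe alternative, which is the whole point
of the restaurant-vs-traction distinction above.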
My favorite analogy is the home insurance one:
It might make sense to buy home insurance because losing one's home
(say through fire) is a loss that usually just cannot be tolerated --
you are literally ruined. We can debate how likely it is to happen,
but in the end it's not so unlikely that it can't be ruled out. At the
same time I may be completely unwilling to buy insurance for personal
electronic devices. I can afford to replace all of them if I truly
have to. And the chances of all of them breaking or being stolen on
the same day are remote (unless my home burns down!). If I drop my cell
phone and crack the screen, I'll be annoyed, but it's certainly not
the end of the world.
This behavior will make perfect sense to most people. But it doesn't
scale up or down. I have quite a few electronic devices, but only a
single home, so technically I'm taking risks far more often than I am
playing it safe here. Am I risk-tolerant when it comes to insurance?
Conservative?
I myself don't think that it is sensible to apply either term here.
It's easier to just look at the specifics. A home is a pretty
important thing to almost everybody; we can afford to treat it as a
special case.
> If that is accurate, I think the big question is how common are cases
> where the outer side can't be proven to have zero or one row and nested
> loops are enough of a win to risk its greater sensitivity to
> misestimation. If it is uncommon, seems we could just code the
> optimizer to use hash joins in those cases without a user-visible knob,
> beyond the knob that already turns off nested loop joins.
I think it's possible that Robert's proposal will lead to very
slightly slower plans in the vast majority of cases that are affected,
while still being a very good idea. Why should insurance be 100% free,
though? Maybe it can be in some cases where we get lucky, but why
should that be the starting point? It just has to be very cheap
relative to what we do today for us to come out ahead, certainly, but
that seems quite possible in at least this case.
--
Peter Geoghegan