Re: disfavoring unparameterized nested loops - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: disfavoring unparameterized nested loops
Date
Msg-id CAH2-Wzm=GRgofq2piu7c6mktm1BPbVe307CZwX-CWXt4tM+2qw@mail.gmail.com
Whole thread Raw
In response to disfavoring unparameterized nested loops  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: disfavoring unparameterized nested loops
List pgsql-hackers
On Tue, Jun 15, 2021 at 10:09 AM Robert Haas <robertmhaas@gmail.com> wrote:
> How to do that is not very clear. One very simple thing we could do
> would be to introduce enable_nestloop=parameterized or
> enable_parameterized_nestloop=false. That is a pretty blunt instrument
> but the authors of the paper seem to have done that with positive
> results, so maybe it's actually not a dumb idea.

I think that it's probably a good idea as-is.

Simple heuristics that are very frequently wrong when considered in a
naive way can work very well in practice. This seems to happen when
they capture some kind of extreme naturally occuring cost/benefit
asymmetry -- especially one with fixed well understood costs and
unlimited benefits (this business with unparameterized nestloop joins
is about *avoiding* the inverse asymmetry, but that seems very
similar). My go to example of such an asymmetry is the rightmost page
split heuristic of applying leaf fillfactor regardless of any of the
other specifics; we effectively assume that all indexes are on columns
with ever-increasing values. Which is obviously wrong.

We're choosing between two alternatives (unparameterized nested loop
vs hash join) that are really very similar when things go as expected,
but diverge sharply when there is a misestimation -- who wouldn't take
the "conservative" choice here?

I guess that there is a hesitation to not introduce heuristics like
this because it doesn't fit into some larger framework that captures
risk -- it might be seen as an ugly special case. But isn't this
already actually kind of special, whether or not we officially think
so?

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: snapshot too old issues, first around wraparound and then more.
Next
From: John Naylor
Date:
Subject: Re: a path towards replacing GEQO with something better