Re: disfavoring unparameterized nested loops - Mailing list pgsql-hackers

From Tom Lane
Subject Re: disfavoring unparameterized nested loops
Msg-id 1650873.1624294359@sss.pgh.pa.us
In response to Re: disfavoring unparameterized nested loops  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: disfavoring unparameterized nested loops  (Peter Geoghegan <pg@bowt.ie>)
Re: disfavoring unparameterized nested loops  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
Peter Geoghegan <pg@bowt.ie> writes:
> On Mon, Jun 21, 2021 at 8:55 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'd be a lot happier if this proposal were couched around some sort
>> of estimate of the risk of the outer side producing more than the
>> expected number of rows.  The arguments so far seem like fairly lame
>> rationalizations for not putting forth the effort to do that.

> I'm not so sure that it is. The point isn't the risk, even if it could
> be calculated. The point is the downsides of being wrong are huge and
> pretty much unbounded, whereas the benefits of being right are tiny
> and bounded. It almost doesn't matter what the underlying
> probabilities are.

You're throwing around a lot of pejorative adjectives there without
having bothered to quantify any of them.  This sounds less like a sound
argument to me than like a witch trial.

Reflecting for a bit on the ancient principle that "the only good numbers
in computer science are 0, 1, and N", it seems to me that we could make
an effort to classify RelOptInfos as provably empty, provably at most one
row, and others.  (This would be a property of relations, not individual
paths, so it needn't bloat struct Path.)  We already understand about
provably-empty rels, so this is just generalizing that idea a little.
Once we know about that, at least for the really-common cases like unique
keys, I'd be okay with a hard rule that we don't consider unparameterized
nestloop joins with an outer side that falls in the "N" category.
Unless there's no alternative, of course.
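
A minimal sketch of the rule described above, written in Python for brevity rather than in the planner's C. Every name here (`RowBound`, `classify_rel`, `consider_unparameterized_nestloop`) is hypothetical and exists only for illustration; none of it is PostgreSQL code, and the two boolean inputs stand in for whatever proofs the planner could actually make.

```python
from enum import Enum


class RowBound(Enum):
    """Hypothetical per-relation classification: the "0, 1, and N" buckets."""
    EMPTY = 0        # provably produces no rows
    AT_MOST_ONE = 1  # provably produces at most one row (e.g. unique key matched)
    MANY = 2         # nothing provable; the "N" category


def classify_rel(provably_empty: bool, unique_key_fully_matched: bool) -> RowBound:
    """Assign a relation to a bucket from two assumed, already-computed proofs."""
    if provably_empty:
        return RowBound.EMPTY
    if unique_key_fully_matched:
        return RowBound.AT_MOST_ONE
    return RowBound.MANY


def consider_unparameterized_nestloop(outer: RowBound, has_alternative: bool) -> bool:
    """Hard rule: skip an unparameterized nestloop whose outer side is "N",
    unless no other join method is available."""
    if outer is not RowBound.MANY:
        return True
    return not has_alternative
```

The classification is a property of the relation rather than of any individual path, which is why the sketch takes no path argument.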

Another thought that occurs to me here is that maybe we could get rid of
the enable_material knob in favor of forcing (or at least encouraging)
materialization when the outer side isn't provably one row.
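
The materialization idea can be sketched the same way. This assumes that the inner side of the nestloop is what gets materialized, and that a provably-empty outer side counts as "at most one row" because it never triggers a rescan; both are assumptions of the sketch. The names are again hypothetical, not PostgreSQL APIs.

```python
from enum import Enum


class RowBound(Enum):
    """Hypothetical "0, 1, and N" classification of a relation."""
    EMPTY = 0
    AT_MOST_ONE = 1
    MANY = 2


def should_materialize_inner(outer: RowBound) -> bool:
    """Replace a global on/off knob with a per-join decision: materialize
    the inner side whenever the outer side might drive more than one scan."""
    return outer is RowBound.MANY
```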

            regards, tom lane


