Re: disfavoring unparameterized nested loops - Mailing list pgsql-hackers

From David Rowley
Subject Re: disfavoring unparameterized nested loops
Msg-id CAApHDvrgc15XGXZSWU0BHGNq6MT-j+ZR5De1ygcKk0F5y00vrg@mail.gmail.com
In response to Re: disfavoring unparameterized nested loops  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: disfavoring unparameterized nested loops  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, 16 Jun 2021 at 15:08, Peter Geoghegan <pg@bowt.ie> wrote:
> It seems important to distinguish between risk and uncertainty --
> they're rather different things. The short version is that you can
> model risk but you cannot model uncertainty. This seems like a problem
> of uncertainty to me.

You might be right there.  "Uncertainty" or "Certainty" seems more
like a value that clauselist_selectivity() would be able to calculate
itself. It would just be up to the planner to determine what to do
with it.

One thing I thought about is that if the costing model was able to
separate out the cost of additional (unexpected) rows, then it would be
easier for add_path() to take into account how badly things might go if
we underestimate.

For example, an unparameterized Nested Loop that estimates the
outer Path to have 1 row will cost an entire additional inner scan if
there are 2 rows.  With Hash Join, the cost is just an additional
hashtable lookup, which is dead cheap.  I don't know exactly how
add_path() would weigh all that up, but it seems to me that I wouldn't
take the risk unless I was 100% certain that the Nested Loop's outer
Path would return exactly 1 row.  If there was any chance at all
it could return more, I'd be picking some other join method.
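
To put rough numbers on that, here is a toy sketch in Python.  It is
not PostgreSQL's cost model and not how add_path() works today; the
class ToyJoinCost, both helper functions, and every constant are
made up for illustration.  It only shows the idea of keeping the cost
of each unexpected outer row separate from the cost at the estimate.

```python
from dataclasses import dataclass


@dataclass
class ToyJoinCost:
    """Hypothetical join cost, split into the cost at the estimated outer
    row count and the added cost of each outer row beyond that estimate."""
    name: str
    base_cost: float       # cost if the outer row estimate is exactly right
    extra_row_cost: float  # added cost per unexpected outer row

    def cost_at(self, extra_rows: int) -> float:
        return self.base_cost + extra_rows * self.extra_row_cost


def toy_nested_loop(outer_cost, inner_scan_cost, est_outer_rows):
    # Unparameterized: the whole inner side is scanned once per outer row,
    # so every extra outer row costs one more full inner scan.
    base = outer_cost + est_outer_rows * inner_scan_cost
    return ToyJoinCost("nested loop", base, inner_scan_cost)


def toy_hash_join(outer_cost, inner_scan_cost, build_cost, probe_cost,
                  est_outer_rows):
    # The inner side is scanned once to build the hash table; every extra
    # outer row only costs one more probe.
    base = outer_cost + inner_scan_cost + build_cost + est_outer_rows * probe_cost
    return ToyJoinCost("hash join", base, probe_cost)


# Assumed, arbitrary numbers: outer Path estimated at 1 row.
nl = toy_nested_loop(outer_cost=1.0, inner_scan_cost=1000.0, est_outer_rows=1)
hj = toy_hash_join(outer_cost=1.0, inner_scan_cost=1000.0, build_cost=50.0,
                   probe_cost=0.01, est_outer_rows=1)

for extra in (0, 1, 10):
    print(f"{extra:>2} extra rows: "
          f"{nl.name} = {nl.cost_at(extra):.2f}, "
          f"{hj.name} = {hj.cost_at(extra):.2f}")
```

With these made-up numbers the Nested Loop is slightly cheaper when
the estimate of 1 row is right, but a single extra outer row roughly
doubles its cost, while the Hash Join's cost barely moves.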

David


