Home > mailing lists

Re: disfavoring unparameterized nested loops - Mailing list pgsql-hackers

From	John Naylor
Subject	Re: disfavoring unparameterized nested loops
Date	June 21, 2021 14:27:10
Msg-id	CAFBsxsFAhMq-nKqY8e3xeTyasyNz0vUC44J84HaCJFj-7PzO6w@mail.gmail.com Whole thread Raw
In response to	Re: disfavoring unparameterized nested loops (David Rowley <dgrowleyml@gmail.com>)
Responses	Re: disfavoring unparameterized nested loops
List	pgsql-hackers

Tree view

On Tue, Jun 15, 2021 at 8:00 PM David Rowley <dgrowleyml@gmail.com> wrote:

> In my experience, the most common reason that the planner chooses
> non-parameterized nested loops wrongly is when there's row
> underestimation that says there's just going to be 1 row returned by
> some set of joins. The problem often comes when some subsequent join
> is planned and the planner sees the given join rel only produces one
> row. The cheapest join method we have to join 1 row is Nested Loop.
> So the planner just sticks the 1-row join rel on the outer side
> thinking the executor will only need to scan the inner side of the
> join once. When the outer row count blows up, then we end up scanning
> that inner side many more times. The problem is compounded when you
> nest it a few joins deep
>
> Most of the time when I see that happen it's down to either the
> selectivity of some correlated base-quals being multiplied down to a
> number low enough that we clamp the estimate to be 1 row. The other
> case is similar, but with join quals.

If an estimate is lower than 1, that should be a red flag that Something Is Wrong. This is kind of a crazy idea, but what if we threw it back the other way by computing 1 / est , and clamping that result to 2 <= res < 10 (or 100 or something)? The theory is, the more impossibly low it is, the more wrong it is. I'm attracted to the idea of dealing with it as an estimation problem and not needing to know about join types. Might have unintended consequences, though.

Long term, it would be great to calculate something about the distribution of cardinality estimates, so we can model risk in the estimates.

--
John Naylor
EDB: http://www.enterprisedb.com

pgsql-hackers by date:

From: Daniel Gustafsson
Date: 21 June 2021, 14:23:56
Subject: Re: SSL/TLS instead of SSL in docs

From: Greg Nancarrow
Date: 21 June 2021, 14:32:24
Subject: Re: Remove useless int64 range checks on BIGINT sequence MINVALUE/MAXVALUE values

Re: disfavoring unparameterized nested loops - Mailing list pgsql-hackers

Previous

Next