[COMMITTERS] pgsql: Fix old corner-case logic error in final_cost_nestloop(). - Mailing list pgsql-committers

From Tom Lane
Subject [COMMITTERS] pgsql: Fix old corner-case logic error in final_cost_nestloop().
Date
Msg-id E1dHD9z-0007Yg-L2@gemulon.postgresql.org
Whole thread Raw
List pgsql-committers
Fix old corner-case logic error in final_cost_nestloop().

When costing a nestloop with stop-at-first-inner-match semantics, and a
non-indexscan inner path, final_cost_nestloop() wants to charge the full
scan cost of the inner rel at least once, with additional scans charged
at inner_rescan_run_cost which might be less.  However the logic for
doing this effectively assumed that outer_matched_rows is at least 1.
If it's zero, which is not unlikely for a small outer rel, we ended up
charging inner_run_cost plus N times inner_rescan_run_cost, as much as
double the correct charge for an outer rel with only one row that
we're betting won't be matched.  (Unless the inner rel is materialized,
in which case it has very small inner_rescan_run_cost and the cost
is not so far off what it should have been.)

The upshot of this was that the planner had a tendency to select plans
that failed to make effective use of the stop-at-first-inner-match
semantics, and that might have Materialize nodes in them even when the
predicted number of executions of the Materialize subplan was only 1.
This was not so obvious before commit 9c7f5229a, because the case only
arose in connection with semi/anti joins where there's not freedom to
reverse the join order.  But with the addition of unique-inner joins,
it could result in some fairly bad planning choices, as reported by
Teodor Sigaev.  Indeed, some of the test cases added by that commit
have plans that look dubious on closer inspection, and are changed
by this patch.

Fix the logic to ensure that we don't charge for too many inner scans.
I chose to adjust it so that the full-freight scan cost is associated
with an unmatched outer row if possible, not a matched one, since that
seems like a better model of what would happen at runtime.

This is a longstanding bug, but given the lesser impact in back branches,
and the lack of field complaints, I won't risk a back-patch.

Discussion: https://postgr.es/m/CAKJS1f-LzkUsFxdJ_-Luy38orQ+AdEXM5o+vANR+-pHAWPSecg@mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/23886581b58c7e5d005d6346c0024a45801cc5a9

Modified Files
--------------
src/backend/optimizer/path/costsize.c | 30 ++++++++++++++-------
src/test/regress/expected/join.out    | 50 +++++++++++++++--------------------
2 files changed, 42 insertions(+), 38 deletions(-)


pgsql-committers by date:

Previous
From: Peter Eisentraut
Date:
Subject: [COMMITTERS] pgsql: Receive invalidation messages correctly in tablesync worker
Next
From: Tom Lane
Date:
Subject: [COMMITTERS] pgsql: Fix <> and pattern-NOT-match estimators to handle nullscorrectl