pgsql: Adjust definition of cheapest_total_path to work better with LAT - Mailing list pgsql-committers

From Tom Lane
Subject pgsql: Adjust definition of cheapest_total_path to work better with LAT
Date
Msg-id E1T6u9I-0000cn-Vj@gemulon.postgresql.org
List pgsql-committers
Adjust definition of cheapest_total_path to work better with LATERAL.

In the initial cut at LATERAL, I kept the rule that cheapest_total_path
was always unparameterized, which meant it had to be NULL if the relation
has no unparameterized paths.  It turns out to work much more nicely if
we always have *some* path nominated as cheapest-total for each relation.
In particular, let's still say it's the cheapest unparameterized path if
there is one; if not, take the cheapest-total-cost path among those of
the minimum available parameterization.  (The first rule is actually
a special case of the second.)
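
The selection rule above can be sketched roughly as follows. This is a simplified illustration, not the code in pathnode.c: the SketchPath struct, the bitmask representation of required outer relations, and the function names are all hypothetical, and "minimum parameterization" is approximated here as "fewest required outer relations".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a planner path; not the real Path struct. */
typedef struct SketchPath
{
    uint32_t required_outer;   /* bitmask of outer rels; 0 = unparameterized */
    double   startup_cost;
    double   total_cost;
} SketchPath;

/* Count the outer relations a path depends on. */
static int
sketch_num_required(uint32_t mask)
{
    int n = 0;

    while (mask != 0)
    {
        mask &= mask - 1;      /* clear the lowest set bit */
        n++;
    }
    return n;
}

/*
 * Pick the cheapest-total path among those with the smallest
 * parameterization.  An unparameterized path requires zero outer rels,
 * so "cheapest unparameterized path" falls out as a special case.
 * Assumption: comparing by count only approximates a true subset test.
 */
static const SketchPath *
sketch_cheapest_total(const SketchPath *paths, size_t npaths)
{
    const SketchPath *best = NULL;

    assert(npaths > 0);        /* every relation must have some path */

    for (size_t i = 0; i < npaths; i++)
    {
        const SketchPath *p = &paths[i];

        if (best == NULL)
        {
            best = p;
            continue;
        }

        int pn = sketch_num_required(p->required_outer);
        int bn = sketch_num_required(best->required_outer);

        if (pn < bn || (pn == bn && p->total_cost < best->total_cost))
            best = p;
    }
    return best;
}
```

With this rule the result is never NULL for a relation that has at least one path, which is the property the joins described below rely on.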

This allows reversion of some temporary lobotomizations I'd put in place.
In particular, the planner can now consider hash and merge joins for
joins below a parameter-supplying nestloop, even if there aren't any
unparameterized paths available.  This should bring planning of
LATERAL-containing queries to the same level as queries not using that
feature.

Along the way, simplify management of parameterized paths in add_path()
and friends.  In the original coding for parameterized paths in 9.2,
I tried to minimize the logic changes in add_path(), so it just treated
parameterization as yet another dimension of comparison for paths.
We later made it ignore pathkeys (sort ordering) of parameterized paths,
on the grounds that ordering isn't a useful property for the path on the
inside of a nestloop, so we might as well get rid of useless parameterized
paths as quickly as possible.  But we didn't take that reasoning as far as
we should have.  Startup cost isn't a useful property inside a nestloop
either, so add_path() ought to discount startup cost of parameterized paths
as well.  Having done that, the secondary sorting I'd implemented (in
add_parameterized_path) is no longer needed --- any parameterized path that
survives add_path() at all is worth considering at higher levels.  So this
should be a bit faster as well as simpler.
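
As a rough illustration of the reasoning above, a dominance test that ignores sort ordering and startup cost for parameterized paths might look like the sketch below. The CandPath struct, the integer sort_key, and sketch_dominates() are hypothetical names for this example only; the real add_path() compares costs fuzzily and weighs further properties that are left out here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a candidate path; not the real Path struct. */
typedef struct CandPath
{
    uint32_t required_outer;   /* bitmask of outer rels; 0 = unparameterized */
    int      sort_key;         /* 0 = unordered; otherwise an ordering id */
    double   startup_cost;
    double   total_cost;
} CandPath;

/*
 * Return true if path "a" makes path "b" redundant.
 * Assumption: exact cost comparisons are used for brevity.
 */
static bool
sketch_dominates(const CandPath *a, const CandPath *b)
{
    /* "a" must not need any outer rel that "b" does not need. */
    if ((a->required_outer & ~b->required_outer) != 0)
        return false;

    if (a->total_cost > b->total_cost)
        return false;

    /*
     * Startup cost and sort ordering are only worth preserving for an
     * unparameterized path.  A parameterized path ends up on the inside
     * of a nestloop, where neither property is useful.
     */
    if (b->required_outer == 0)
    {
        if (a->startup_cost > b->startup_cost)
            return false;
        if (b->sort_key != 0 && a->sort_key != b->sort_key)
            return false;
    }
    return true;
}
```

Under this test, a parameterized path with a low startup cost or a useful ordering but a higher total cost is discarded at once, so every parameterized path that survives is worth considering at higher join levels and no secondary sort is needed.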

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/e83bb10d6dcf05a666d4ada00d9788c7974ad378

Modified Files
--------------
src/backend/optimizer/README           |   17 ++-
src/backend/optimizer/geqo/geqo_eval.c |   10 +-
src/backend/optimizer/path/allpaths.c  |    3 +-
src/backend/optimizer/path/joinpath.c  |   85 +++++----
src/backend/optimizer/plan/planmain.c  |    3 +-
src/backend/optimizer/util/pathnode.c  |  320 +++++++++++++-------------------
src/include/nodes/relation.h           |   18 +-
src/test/regress/expected/join.out     |   13 +-
8 files changed, 217 insertions(+), 252 deletions(-)

