Re: plan shape work - Mailing list pgsql-hackers

From Tom Lane
Subject Re: plan shape work
Date
Msg-id 3200728.1758662857@sss.pgh.pa.us
Whole thread Raw
In response to Re: plan shape work  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: plan shape work
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Sep 22, 2025 at 2:15 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> In the second place, we should not need to add two hundred lines
>> of new code to createplan.c to accomplish this.  Why not simply
>> bms_difference the joinrel's relids from the union of the inputs'
>> relids?

> ... On the other hand, I'm not sure that I'm interpreting your remarks
> correctly. When you say "bms_difference the joinrel's relids from the
> union of the inputs' relids" maybe you're specifically talking about
> the handling of the RTE_JOIN relids, and I don't care very much how we
> account for those. So I guess I need some clarification here as to
> what your thinking is.

What I'm saying is that I'd be much happier with 0003 if it looked
about like the attached.  We do not need a heap of mechanism
redundantly proving that the planner is getting these things right
(and potentially containing its own bugs).

> Note that my goal for
> this commitfest was to get 0001-0004 committed, partly because I
> wasn't too sure whether the later patches might need some adjustment.

Fair enough.  I think we can reach agreement on that much pretty quickly.

            regards, tom lane

From 1a3a6162691105ac23522f66b54dba42850e993e Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Tue, 23 Sep 2025 17:18:33 -0400
Subject: [PATCH v8] Ensure that all joinrel RTIs are discoverable from join
 plans.

Every RTI associated with a joinrel appears either on the outer or inner
side of the joinrel or is an outer join completed by the joinrel.
Previously, the RTIs of outer joins cmopleted by the joinrel were not
stored anywhere; now, we store them in a new 'ojrelids' field of the
Join itself, for the benefit of code that wants to study Plan trees.

All of this is intended as infrastructure to make it possible to
reliably determine the chosen join order from the final plan, although
it's not sufficient for that goal of itself, due to further problems
created by setrefs-time processing.
---
 .../expected/pg_overexplain.out               | 40 ++++++++++++++++++-
 contrib/pg_overexplain/pg_overexplain.c       | 21 ++++++++++
 contrib/pg_overexplain/sql/pg_overexplain.sql | 14 ++++++-
 src/backend/optimizer/plan/createplan.c       | 39 ++++++++++++++++--
 src/include/nodes/plannodes.h                 |  2 +
 5 files changed, 109 insertions(+), 7 deletions(-)

diff --git a/contrib/pg_overexplain/expected/pg_overexplain.out b/contrib/pg_overexplain/expected/pg_overexplain.out
index 55d34666d87..57c997e8b32 100644
--- a/contrib/pg_overexplain/expected/pg_overexplain.out
+++ b/contrib/pg_overexplain/expected/pg_overexplain.out
@@ -377,14 +377,15 @@ $$);
 (15 rows)

 -- Create an index, and then attempt to force a nested loop with inner index
--- scan so that we can see parameter-related information. Also, let's try
--- actually running the query, but try to suppress potentially variable output.
+-- scan so that we can see parameter-related information.
 CREATE INDEX ON vegetables (id);
 ANALYZE vegetables;
 SET enable_hashjoin = false;
 SET enable_material = false;
 SET enable_mergejoin = false;
 SET enable_seqscan = false;
+-- Let's try actually running the query, but try to suppress potentially
+-- variable output.
 SELECT explain_filter($$
 EXPLAIN (BUFFERS OFF, COSTS OFF, SUMMARY OFF, TIMING OFF, ANALYZE, DEBUG)
 SELECT * FROM vegetables v1, vegetables v2 WHERE v1.id = v2.id;
@@ -440,6 +441,41 @@ $$);
    Parse Location: 0 to end
 (47 rows)

+-- Test the RANGE_TABLE otion with a case that involves an outer join.
+SELECT explain_filter($$
+EXPLAIN (RANGE_TABLE, COSTS OFF)
+SELECT * FROM daucus d LEFT JOIN brassica b ON d.id = b.id;
+$$);
+                     explain_filter
+---------------------------------------------------------
+ Nested Loop Left Join
+   Outer Join RTIs: 3
+   ->  Index Scan using daucus_id_idx on daucus d
+         Scan RTI: 1
+   ->  Index Scan using brassica_id_idx on brassica b
+         Index Cond: (id = d.id)
+         Scan RTI: 2
+ RTI 1 (relation, in-from-clause):
+   Alias: d ()
+   Eref: d (id, name, genus)
+   Relation: daucus
+   Relation Kind: relation
+   Relation Lock Mode: AccessShareLock
+   Permission Info Index: 1
+ RTI 2 (relation, in-from-clause):
+   Alias: b ()
+   Eref: b (id, name, genus)
+   Relation: brassica
+   Relation Kind: relation
+   Relation Lock Mode: AccessShareLock
+   Permission Info Index: 2
+ RTI 3 (join, in-from-clause):
+   Eref: unnamed_join (id, name, genus, id, name, genus)
+   Join Type: Left
+ Unprunable RTIs: 1 2
+(25 rows)
+
+-- Restore default settings.
 RESET enable_hashjoin;
 RESET enable_material;
 RESET enable_mergejoin;
diff --git a/contrib/pg_overexplain/pg_overexplain.c b/contrib/pg_overexplain/pg_overexplain.c
index bd70b6d9d5e..92cfd8af2eb 100644
--- a/contrib/pg_overexplain/pg_overexplain.c
+++ b/contrib/pg_overexplain/pg_overexplain.c
@@ -248,6 +248,27 @@ overexplain_per_node_hook(PlanState *planstate, List *ancestors,
                     overexplain_bitmapset("RTIs",
                                           ((Result *) plan)->relids,
                                           es);
+                break;
+
+            case T_MergeJoin:
+            case T_NestLoop:
+            case T_HashJoin:
+                {
+                    Join       *join = (Join *) plan;
+
+                    /*
+                     * 'ojrelids' is only meaningful for non-inner joins, but
+                     * if it somehow ends up set for an inner join, print it
+                     * anyway.
+                     */
+                    if (join->jointype != JOIN_INNER ||
+                        join->ojrelids != NULL)
+                        overexplain_bitmapset("Outer Join RTIs",
+                                              join->ojrelids,
+                                              es);
+                    break;
+                }
+
             default:
                 break;
         }
diff --git a/contrib/pg_overexplain/sql/pg_overexplain.sql b/contrib/pg_overexplain/sql/pg_overexplain.sql
index 42e275ac2f9..53aa9ff788e 100644
--- a/contrib/pg_overexplain/sql/pg_overexplain.sql
+++ b/contrib/pg_overexplain/sql/pg_overexplain.sql
@@ -86,18 +86,28 @@ INSERT INTO vegetables (name, genus)
 $$);

 -- Create an index, and then attempt to force a nested loop with inner index
--- scan so that we can see parameter-related information. Also, let's try
--- actually running the query, but try to suppress potentially variable output.
+-- scan so that we can see parameter-related information.
 CREATE INDEX ON vegetables (id);
 ANALYZE vegetables;
 SET enable_hashjoin = false;
 SET enable_material = false;
 SET enable_mergejoin = false;
 SET enable_seqscan = false;
+
+-- Let's try actually running the query, but try to suppress potentially
+-- variable output.
 SELECT explain_filter($$
 EXPLAIN (BUFFERS OFF, COSTS OFF, SUMMARY OFF, TIMING OFF, ANALYZE, DEBUG)
 SELECT * FROM vegetables v1, vegetables v2 WHERE v1.id = v2.id;
 $$);
+
+-- Test the RANGE_TABLE otion with a case that involves an outer join.
+SELECT explain_filter($$
+EXPLAIN (RANGE_TABLE, COSTS OFF)
+SELECT * FROM daucus d LEFT JOIN brassica b ON d.id = b.id;
+$$);
+
+-- Restore default settings.
 RESET enable_hashjoin;
 RESET enable_material;
 RESET enable_mergejoin;
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index c9dba7ff346..e6bb16ff7c0 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -232,14 +232,18 @@ static BitmapOr *make_bitmap_or(List *bitmapplans);
 static NestLoop *make_nestloop(List *tlist,
                                List *joinclauses, List *otherclauses, List *nestParams,
                                Plan *lefttree, Plan *righttree,
-                               JoinType jointype, bool inner_unique);
+                               JoinType jointype,
+                               Relids ojrelids,
+                               bool inner_unique);
 static HashJoin *make_hashjoin(List *tlist,
                                List *joinclauses, List *otherclauses,
                                List *hashclauses,
                                List *hashoperators, List *hashcollations,
                                List *hashkeys,
                                Plan *lefttree, Plan *righttree,
-                               JoinType jointype, bool inner_unique);
+                               JoinType jointype,
+                               Relids ojrelids,
+                               bool inner_unique);
 static Hash *make_hash(Plan *lefttree,
                        List *hashkeys,
                        Oid skewTable,
@@ -253,7 +257,9 @@ static MergeJoin *make_mergejoin(List *tlist,
                                  bool *mergereversals,
                                  bool *mergenullsfirst,
                                  Plan *lefttree, Plan *righttree,
-                                 JoinType jointype, bool inner_unique,
+                                 JoinType jointype,
+                                 Relids ojrelids,
+                                 bool inner_unique,
                                  bool skip_mark_restore);
 static Sort *make_sort(Plan *lefttree, int numCols,
                        AttrNumber *sortColIdx, Oid *sortOperators,
@@ -4199,6 +4205,7 @@ create_nestloop_plan(PlannerInfo *root,
     Plan       *outer_plan;
     Plan       *inner_plan;
     Relids        outerrelids;
+    Relids        ojrelids;
     List       *tlist = build_path_tlist(root, &best_path->jpath.path);
     List       *joinrestrictclauses = best_path->jpath.joinrestrictinfo;
     List       *joinclauses;
@@ -4265,6 +4272,11 @@ create_nestloop_plan(PlannerInfo *root,
             replace_nestloop_params(root, (Node *) otherclauses);
     }

+    /* Identify any outer joins computed at this level */
+    ojrelids = bms_difference(best_path->jpath.path.parent->relids,
+                              bms_union(best_path->jpath.outerjoinpath->parent->relids,
+                                        best_path->jpath.innerjoinpath->parent->relids));
+
     /*
      * Identify any nestloop parameters that should be supplied by this join
      * node, and remove them from root->curOuterParams.
@@ -4336,6 +4348,7 @@ create_nestloop_plan(PlannerInfo *root,
                               outer_plan,
                               inner_plan,
                               best_path->jpath.jointype,
+                              ojrelids,
                               best_path->jpath.inner_unique);

     copy_generic_path_info(&join_plan->join.plan, &best_path->jpath.path);
@@ -4350,6 +4363,7 @@ create_mergejoin_plan(PlannerInfo *root,
     MergeJoin  *join_plan;
     Plan       *outer_plan;
     Plan       *inner_plan;
+    Relids        ojrelids;
     List       *tlist = build_path_tlist(root, &best_path->jpath.path);
     List       *joinclauses;
     List       *otherclauses;
@@ -4428,6 +4442,11 @@ create_mergejoin_plan(PlannerInfo *root,
     mergeclauses = get_switched_clauses(best_path->path_mergeclauses,
                                         best_path->jpath.outerjoinpath->parent->relids);

+    /* Identify any outer joins computed at this level */
+    ojrelids = bms_difference(best_path->jpath.path.parent->relids,
+                              bms_union(outer_path->parent->relids,
+                                        inner_path->parent->relids));
+
     /*
      * Create explicit sort nodes for the outer and inner paths if necessary.
      */
@@ -4688,6 +4707,7 @@ create_mergejoin_plan(PlannerInfo *root,
                                outer_plan,
                                inner_plan,
                                best_path->jpath.jointype,
+                               ojrelids,
                                best_path->jpath.inner_unique,
                                best_path->skip_mark_restore);

@@ -4705,6 +4725,7 @@ create_hashjoin_plan(PlannerInfo *root,
     Hash       *hash_plan;
     Plan       *outer_plan;
     Plan       *inner_plan;
+    Relids        ojrelids;
     List       *tlist = build_path_tlist(root, &best_path->jpath.path);
     List       *joinclauses;
     List       *otherclauses;
@@ -4853,6 +4874,11 @@ create_hashjoin_plan(PlannerInfo *root,
         hash_plan->rows_total = best_path->inner_rows_total;
     }

+    /* Identify any outer joins computed at this level */
+    ojrelids = bms_difference(best_path->jpath.path.parent->relids,
+                              bms_union(best_path->jpath.outerjoinpath->parent->relids,
+                                        best_path->jpath.innerjoinpath->parent->relids));
+
     join_plan = make_hashjoin(tlist,
                               joinclauses,
                               otherclauses,
@@ -4863,6 +4889,7 @@ create_hashjoin_plan(PlannerInfo *root,
                               outer_plan,
                               (Plan *) hash_plan,
                               best_path->jpath.jointype,
+                              ojrelids,
                               best_path->jpath.inner_unique);

     copy_generic_path_info(&join_plan->join.plan, &best_path->jpath.path);
@@ -5935,6 +5962,7 @@ make_nestloop(List *tlist,
               Plan *lefttree,
               Plan *righttree,
               JoinType jointype,
+              Relids ojrelids,
               bool inner_unique)
 {
     NestLoop   *node = makeNode(NestLoop);
@@ -5947,6 +5975,7 @@ make_nestloop(List *tlist,
     node->join.jointype = jointype;
     node->join.inner_unique = inner_unique;
     node->join.joinqual = joinclauses;
+    node->join.ojrelids = ojrelids;
     node->nestParams = nestParams;

     return node;
@@ -5963,6 +5992,7 @@ make_hashjoin(List *tlist,
               Plan *lefttree,
               Plan *righttree,
               JoinType jointype,
+              Relids ojrelids,
               bool inner_unique)
 {
     HashJoin   *node = makeNode(HashJoin);
@@ -5979,6 +6009,7 @@ make_hashjoin(List *tlist,
     node->join.jointype = jointype;
     node->join.inner_unique = inner_unique;
     node->join.joinqual = joinclauses;
+    node->join.ojrelids = ojrelids;

     return node;
 }
@@ -6018,6 +6049,7 @@ make_mergejoin(List *tlist,
                Plan *lefttree,
                Plan *righttree,
                JoinType jointype,
+               Relids ojrelids,
                bool inner_unique,
                bool skip_mark_restore)
 {
@@ -6037,6 +6069,7 @@ make_mergejoin(List *tlist,
     node->join.jointype = jointype;
     node->join.inner_unique = inner_unique;
     node->join.joinqual = joinclauses;
+    node->join.ojrelids = ojrelids;

     return node;
 }
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 3d196f5078e..16f3f5a7925 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -938,6 +938,7 @@ typedef struct CustomScan
  * inner_unique each outer tuple can match to no more than one inner tuple
  * joinqual:    qual conditions that came from JOIN/ON or JOIN/USING
  *                (plan.qual contains conditions that came from WHERE)
+ * ojrelids:    outer joins completed at this level
  *
  * When jointype is INNER, joinqual and plan.qual are semantically
  * interchangeable.  For OUTER jointypes, the two are *not* interchangeable;
@@ -962,6 +963,7 @@ typedef struct Join
     bool        inner_unique;
     /* JOIN quals (in addition to plan.qual) */
     List       *joinqual;
+    Bitmapset  *ojrelids;
 } Join;

 /* ----------------
--
2.43.7


pgsql-hackers by date:

Previous
From: Jacob Champion
Date:
Subject: Re: libcurl in libpq.pc
Next
From: Jacob Champion
Date:
Subject: Re: libcurl in libpq.pc