Re: Oversight in reparameterize_path_by_child leading to executor crash - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Oversight in reparameterize_path_by_child leading to executor crash
Date
Msg-id 2962252.1692650886@sss.pgh.pa.us
Whole thread Raw
In response to Re: Oversight in reparameterize_path_by_child leading to executor crash  (Richard Guo <guofenglinux@gmail.com>)
Responses Re: Oversight in reparameterize_path_by_child leading to executor crash
Re: Oversight in reparameterize_path_by_child leading to executor crash
List pgsql-hackers
I spent some time reviewing the v4 patch.  I noted that
path_is_reparameterizable_by_child still wasn't modeling the pass/fail
behavior of reparameterize_path_by_child very well, because that
function checks this at every recursion level:

    /*
     * If the path is not parameterized by the parent of the given relation,
     * it doesn't need reparameterization.
     */
    if (!path->param_info ||
        !bms_overlap(PATH_REQ_OUTER(path), child_rel->top_parent_relids))
        return path;

So we might have a sub-path (eg a join input rel) that is not one of
the supported kinds, and yet we can succeed because parameterization
appears only in other sub-paths.  The patch as written will not crash
in such a case, but it might refuse to use a path that we previously
would have allowed.  So I think we need to put the same test into
path_is_reparameterizable_by_child, which requires adding child_rel
to its param list but is otherwise simple enough.

I also realized that this test is equivalent to 
PATH_PARAM_BY_PARENT(), which makes it really unnecessary for
createplan.c to test PATH_PARAM_BY_PARENT, so we don't need to
expose those macros globally after all.

(On the same logic, we could skip PATH_PARAM_BY_PARENT at the
call sites of path_is_reparameterizable_by_child.  I didn't
do that in the attached, mainly because it seems to make it
harder to understand/explain what is being tested.)

Another change I made is to put the path_is_reparameterizable_by_child
tests before the initial_cost_nestloop/add_path_precheck steps, on
the grounds that they're now cheap enough that we might as well do
them first.  The existing ordering of these steps was sensible when
we were doing the expensive reparameterization, but it seems a bit
unnatural IMO.

Lastly, although I'd asked for a test case demonstrating detection of
an unparameterizable sub-path, I ended up not using that, because
it seemed pretty fragile.  If somebody decides that
reparameterize_path_by_child ought to cover TidPaths, the test won't
prove anything any more.

So that led me to the attached v5, which seemed committable to me so I
set about back-patching it ... and it fell over immediately in v15, as
shown in the attached regression diffs from v15.  It looks to me like
we are now failing to recognize that reparameterized quals are
redundant with not-reparameterized ones, so this patch is somehow
dependent on restructuring that happened during the v16 cycle.
I don't have time to dig deeper than that, and I'm not sure that that
is an area we'd want to mess with in a back-patched bug fix.

What I'm thinking we ought to do instead for the back branches
is just refuse to generate a reparameterized path for tablesample
scans.  A minimal fix like that could be as little as

        case T_Path:
+           if (path->pathtype == T_SampleScan)
+               return NULL;
            FLAT_COPY_PATH(new_path, path, Path);
            break;

This rejects more than it absolutely has to, because the
parameterization (that we know exists) might be in the path's
regular quals or tlist rather than in the tablesample.
So we could add something to see if the tablesample is
parameter-free, but I'm quite unsure that it's worth the
trouble.  There must be just about nobody using such cases,
or we'd have heard of this bug long ago.

(BTW, I did look at Ashutosh's idea of merging the
reparameterize_path_by_child and path_is_reparameterizable_by_child
functions, but I didn't think that would be an improvement,
because we'd have to clutter reparameterize_path_by_child with
a lot of should-I-skip-this-step tests.  Some of that could be
hidden in the macros, but a lot would not be.  Another issue
is that I do not think we can change reparameterize_path_by_child's
API contract in the back branches, because we advertise it as
something that FDWs and custom scan providers can use.)

Thoughts?

            regards, tom lane

diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 821d282497..e2ecf5b14b 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -30,8 +30,9 @@
 set_join_pathlist_hook_type set_join_pathlist_hook = NULL;

 /*
- * Paths parameterized by the parent can be considered to be parameterized by
- * any of its child.
+ * Paths parameterized by a parent rel can be considered to be parameterized
+ * by any of its children, when we are performing partitionwise joins.  These
+ * macros simplify checking for such cases.  Beware multiple eval of args.
  */
 #define PATH_PARAM_BY_PARENT(path, rel)    \
     ((path)->param_info && bms_overlap(PATH_REQ_OUTER(path),    \
@@ -762,6 +763,20 @@ try_nestloop_path(PlannerInfo *root,
     /* If we got past that, we shouldn't have any unsafe outer-join refs */
     Assert(!have_unsafe_outer_join_ref(root, outerrelids, inner_paramrels));

+    /*
+     * If the inner path is parameterized, it is parameterized by the topmost
+     * parent of the outer rel, not the outer rel itself.  We will need to
+     * translate the parameterization, if this path is chosen, during
+     * create_plan().  Here we just check whether we will be able to perform
+     * the translation, and if not avoid creating a nestloop path.
+     */
+    if (PATH_PARAM_BY_PARENT(inner_path, outer_path->parent) &&
+        !path_is_reparameterizable_by_child(inner_path, outer_path->parent))
+    {
+        bms_free(required_outer);
+        return;
+    }
+
     /*
      * Do a precheck to quickly eliminate obviously-inferior paths.  We
      * calculate a cheap lower bound on the path's cost and then use
@@ -778,27 +793,6 @@ try_nestloop_path(PlannerInfo *root,
                           workspace.startup_cost, workspace.total_cost,
                           pathkeys, required_outer))
     {
-        /*
-         * If the inner path is parameterized, it is parameterized by the
-         * topmost parent of the outer rel, not the outer rel itself.  Fix
-         * that.
-         */
-        if (PATH_PARAM_BY_PARENT(inner_path, outer_path->parent))
-        {
-            inner_path = reparameterize_path_by_child(root, inner_path,
-                                                      outer_path->parent);
-
-            /*
-             * If we could not translate the path, we can't create nest loop
-             * path.
-             */
-            if (!inner_path)
-            {
-                bms_free(required_outer);
-                return;
-            }
-        }
-
         add_path(joinrel, (Path *)
                  create_nestloop_path(root,
                                       joinrel,
@@ -861,6 +855,17 @@ try_partial_nestloop_path(PlannerInfo *root,
             return;
     }

+    /*
+     * If the inner path is parameterized, it is parameterized by the topmost
+     * parent of the outer rel, not the outer rel itself.  We will need to
+     * translate the parameterization, if this path is chosen, during
+     * create_plan().  Here we just check whether we will be able to perform
+     * the translation, and if not avoid creating a nestloop path.
+     */
+    if (PATH_PARAM_BY_PARENT(inner_path, outer_path->parent) &&
+        !path_is_reparameterizable_by_child(inner_path, outer_path->parent))
+        return;
+
     /*
      * Before creating a path, get a quick lower bound on what it is likely to
      * cost.  Bail out right away if it looks terrible.
@@ -870,22 +875,6 @@ try_partial_nestloop_path(PlannerInfo *root,
     if (!add_partial_path_precheck(joinrel, workspace.total_cost, pathkeys))
         return;

-    /*
-     * If the inner path is parameterized, it is parameterized by the topmost
-     * parent of the outer rel, not the outer rel itself.  Fix that.
-     */
-    if (PATH_PARAM_BY_PARENT(inner_path, outer_path->parent))
-    {
-        inner_path = reparameterize_path_by_child(root, inner_path,
-                                                  outer_path->parent);
-
-        /*
-         * If we could not translate the path, we can't create nest loop path.
-         */
-        if (!inner_path)
-            return;
-    }
-
     /* Might be good enough to be worth trying, so let's try it. */
     add_partial_path(joinrel, (Path *)
                      create_nestloop_path(root,
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 34ca6d4ac2..38bd179f4f 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -29,6 +29,7 @@
 #include "optimizer/cost.h"
 #include "optimizer/optimizer.h"
 #include "optimizer/paramassign.h"
+#include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
@@ -4346,6 +4347,22 @@ create_nestloop_plan(PlannerInfo *root,
     List       *nestParams;
     Relids        saveOuterRels = root->curOuterRels;

+    /*
+     * If the inner path is parameterized by the topmost parent of the outer
+     * rel rather than the outer rel itself, fix that.  (Nothing happens here
+     * if it is not so parameterized.)
+     */
+    best_path->jpath.innerjoinpath =
+        reparameterize_path_by_child(root,
+                                     best_path->jpath.innerjoinpath,
+                                     best_path->jpath.outerjoinpath->parent);
+
+    /*
+     * Failure here probably means that reparameterize_path_by_child() is not
+     * in sync with path_is_reparameterizable_by_child().
+     */
+    Assert(best_path->jpath.innerjoinpath != NULL);
+
     /* NestLoop can project, so no need to be picky about child tlists */
     outer_plan = create_plan_recurse(root, best_path->jpath.outerjoinpath, 0);

diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 211ba65389..21d002243c 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -56,6 +56,8 @@ static int    append_startup_cost_compare(const ListCell *a, const ListCell *b);
 static List *reparameterize_pathlist_by_child(PlannerInfo *root,
                                               List *pathlist,
                                               RelOptInfo *child_rel);
+static bool pathlist_is_reparameterizable_by_child(List *pathlist,
+                                                   RelOptInfo *child_rel);


 /*****************************************************************************
@@ -2436,6 +2438,16 @@ create_nestloop_path(PlannerInfo *root,
 {
     NestPath   *pathnode = makeNode(NestPath);
     Relids        inner_req_outer = PATH_REQ_OUTER(inner_path);
+    Relids        outerrelids;
+
+    /*
+     * Paths are parameterized by top-level parents, so run parameterization
+     * tests on the parent relids.
+     */
+    if (outer_path->parent->top_parent_relids)
+        outerrelids = outer_path->parent->top_parent_relids;
+    else
+        outerrelids = outer_path->parent->relids;

     /*
      * If the inner path is parameterized by the outer, we must drop any
@@ -2445,7 +2457,7 @@ create_nestloop_path(PlannerInfo *root,
      * estimates for this path.  We detect such clauses by checking for serial
      * number match to clauses already enforced in the inner path.
      */
-    if (bms_overlap(inner_req_outer, outer_path->parent->relids))
+    if (bms_overlap(inner_req_outer, outerrelids))
     {
         Bitmapset  *enforced_serials = get_param_path_clause_serials(inner_path);
         List       *jclauses = NIL;
@@ -4042,6 +4054,12 @@ reparameterize_path(PlannerInfo *root, Path *path,
  *
  * Currently, only a few path types are supported here, though more could be
  * added at need.  We return NULL if we can't reparameterize the given path.
+ *
+ * Note that this function can change referenced RTEs as well as the Path
+ * structures.  Therefore, it's only safe to call during create_plan(),
+ * when we have made a final choice of which Path to use for each RTE.
+ *
+ * Keep this code in sync with path_is_reparameterizable_by_child()!
  */
 Path *
 reparameterize_path_by_child(PlannerInfo *root, Path *path,
@@ -4054,7 +4072,7 @@ reparameterize_path_by_child(PlannerInfo *root, Path *path,

 #define ADJUST_CHILD_ATTRS(node) \
     ((node) = \
-     (List *) adjust_appendrel_attrs_multilevel(root, (Node *) (node), \
+     (void *) adjust_appendrel_attrs_multilevel(root, (Node *) (node), \
                                                 child_rel, \
                                                 child_rel->top_parent))

@@ -4082,8 +4100,8 @@ do { \
     Relids        required_outer;

     /*
-     * If the path is not parameterized by parent of the given relation, it
-     * doesn't need reparameterization.
+     * If the path is not parameterized by the parent of the given relation,
+     * it doesn't need reparameterization.
      */
     if (!path->param_info ||
         !bms_overlap(PATH_REQ_OUTER(path), child_rel->top_parent_relids))
@@ -4105,6 +4123,19 @@ do { \
     {
         case T_Path:
             FLAT_COPY_PATH(new_path, path, Path);
+            if (path->pathtype == T_SampleScan)
+            {
+                Index        scan_relid = path->parent->relid;
+                RangeTblEntry *rte;
+
+                /* it should be a base rel with a tablesample clause... */
+                Assert(scan_relid > 0);
+                rte = planner_rt_fetch(scan_relid, root);
+                Assert(rte->rtekind == RTE_RELATION);
+                Assert(rte->tablesample != NULL);
+
+                ADJUST_CHILD_ATTRS(rte->tablesample);
+            }
             break;

         case T_IndexPath:
@@ -4335,9 +4366,145 @@ do { \
     return new_path;
 }

+/*
+ * path_is_reparameterizable_by_child
+ *         Given a path parameterized by the parent of the given child relation,
+ *         see if it can be translated to be parameterized by the child relation.
+ *
+ * This must return true if and only if reparameterize_path_by_child()
+ * would succeed on this path.  Currently it's sufficient to verify that
+ * the path and all of its subpaths (if any) are of the types handled by
+ * that function.  However, sub-paths that are not parameterized can be
+ * disregarded since they won't require translation.
+ */
+bool
+path_is_reparameterizable_by_child(Path *path, RelOptInfo *child_rel)
+{
+
+#define CHILD_PATH_IS_REPARAMETERIZABLE(path) \
+do { \
+    if (!path_is_reparameterizable_by_child(path, child_rel)) \
+        return false; \
+} while(0)
+
+#define CHILD_PATH_LIST_IS_REPARAMETERIZABLE(pathlist) \
+do { \
+    if (!pathlist_is_reparameterizable_by_child(pathlist, child_rel)) \
+        return false; \
+} while(0)
+
+    /*
+     * If the path is not parameterized by the parent of the given relation,
+     * it doesn't need reparameterization.
+     */
+    if (!path->param_info ||
+        !bms_overlap(PATH_REQ_OUTER(path), child_rel->top_parent_relids))
+        return true;
+
+    switch (nodeTag(path))
+    {
+        case T_Path:
+        case T_IndexPath:
+            break;
+
+        case T_BitmapHeapPath:
+            {
+                BitmapHeapPath *bhpath = (BitmapHeapPath *) path;
+
+                CHILD_PATH_IS_REPARAMETERIZABLE(bhpath->bitmapqual);
+            }
+            break;
+
+        case T_BitmapAndPath:
+            {
+                BitmapAndPath *bapath = (BitmapAndPath *) path;
+
+                CHILD_PATH_LIST_IS_REPARAMETERIZABLE(bapath->bitmapquals);
+            }
+            break;
+
+        case T_BitmapOrPath:
+            {
+                BitmapOrPath *bopath = (BitmapOrPath *) path;
+
+                CHILD_PATH_LIST_IS_REPARAMETERIZABLE(bopath->bitmapquals);
+            }
+            break;
+
+        case T_ForeignPath:
+            {
+                ForeignPath *fpath = (ForeignPath *) path;
+
+                if (fpath->fdw_outerpath)
+                    CHILD_PATH_IS_REPARAMETERIZABLE(fpath->fdw_outerpath);
+            }
+            break;
+
+        case T_CustomPath:
+            {
+                CustomPath *cpath = (CustomPath *) path;
+
+                CHILD_PATH_LIST_IS_REPARAMETERIZABLE(cpath->custom_paths);
+            }
+            break;
+
+        case T_NestPath:
+        case T_MergePath:
+        case T_HashPath:
+            {
+                JoinPath   *jpath = (JoinPath *) path;
+
+                CHILD_PATH_IS_REPARAMETERIZABLE(jpath->outerjoinpath);
+                CHILD_PATH_IS_REPARAMETERIZABLE(jpath->innerjoinpath);
+            }
+            break;
+
+        case T_AppendPath:
+            {
+                AppendPath *apath = (AppendPath *) path;
+
+                CHILD_PATH_LIST_IS_REPARAMETERIZABLE(apath->subpaths);
+            }
+            break;
+
+        case T_MaterialPath:
+            {
+                MaterialPath *mpath = (MaterialPath *) path;
+
+                CHILD_PATH_IS_REPARAMETERIZABLE(mpath->subpath);
+            }
+            break;
+
+        case T_MemoizePath:
+            {
+                MemoizePath *mpath = (MemoizePath *) path;
+
+                CHILD_PATH_IS_REPARAMETERIZABLE(mpath->subpath);
+            }
+            break;
+
+        case T_GatherPath:
+            {
+                GatherPath *gpath = (GatherPath *) path;
+
+                CHILD_PATH_IS_REPARAMETERIZABLE(gpath->subpath);
+            }
+            break;
+
+        default:
+
+            /* We don't know how to reparameterize this path. */
+            return false;
+    }
+
+    return true;
+}
+
 /*
  * reparameterize_pathlist_by_child
  *         Helper function to reparameterize a list of paths by given child rel.
+ *
+ * Returns NIL to indicate failure, so pathlist had better not be NIL.
  */
 static List *
 reparameterize_pathlist_by_child(PlannerInfo *root,
@@ -4363,3 +4530,23 @@ reparameterize_pathlist_by_child(PlannerInfo *root,

     return result;
 }
+
+/*
+ * pathlist_is_reparameterizable_by_child
+ *        Helper function to check if a list of paths can be reparameterized.
+ */
+static bool
+pathlist_is_reparameterizable_by_child(List *pathlist, RelOptInfo *child_rel)
+{
+    ListCell   *lc;
+
+    foreach(lc, pathlist)
+    {
+        Path       *path = (Path *) lfirst(lc);
+
+        if (!path_is_reparameterizable_by_child(path, child_rel))
+            return false;
+    }
+
+    return true;
+}
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 6e557bebc4..f5f8cbcfb4 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -298,6 +298,8 @@ extern Path *reparameterize_path(PlannerInfo *root, Path *path,
                                  double loop_count);
 extern Path *reparameterize_path_by_child(PlannerInfo *root, Path *path,
                                           RelOptInfo *child_rel);
+extern bool path_is_reparameterizable_by_child(Path *path,
+                                               RelOptInfo *child_rel);

 /*
  * prototypes for relnode.c
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index 6560fe2416..a11f738411 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -505,6 +505,31 @@ SELECT t1.a, ss.t2a, ss.t2c FROM prt1 t1 LEFT JOIN LATERAL
  550 |     |
 (12 rows)

+-- lateral reference in sample scan
+EXPLAIN (COSTS OFF)
+SELECT * FROM prt1 t1 JOIN LATERAL
+              (SELECT * FROM prt1 t2 TABLESAMPLE SYSTEM (t1.a) REPEATABLE(t1.b)) s
+              ON t1.a = s.a;
+                         QUERY PLAN
+-------------------------------------------------------------
+ Append
+   ->  Nested Loop
+         ->  Seq Scan on prt1_p1 t1_1
+         ->  Sample Scan on prt1_p1 t2_1
+               Sampling: system (t1_1.a) REPEATABLE (t1_1.b)
+               Filter: (t1_1.a = a)
+   ->  Nested Loop
+         ->  Seq Scan on prt1_p2 t1_2
+         ->  Sample Scan on prt1_p2 t2_2
+               Sampling: system (t1_2.a) REPEATABLE (t1_2.b)
+               Filter: (t1_2.a = a)
+   ->  Nested Loop
+         ->  Seq Scan on prt1_p3 t1_3
+         ->  Sample Scan on prt1_p3 t2_3
+               Sampling: system (t1_3.a) REPEATABLE (t1_3.b)
+               Filter: (t1_3.a = a)
+(16 rows)
+
 -- bug with inadequate sort key representation
 SET enable_partitionwise_aggregate TO true;
 SET enable_hashjoin TO false;
@@ -1944,6 +1969,41 @@ SELECT * FROM prt1_l t1 LEFT JOIN LATERAL
  550 | 0 | 0002 |     |      |     |     |
 (12 rows)

+-- partitionwise join with lateral reference in sample scan
+EXPLAIN (COSTS OFF)
+SELECT * FROM prt1_l t1 JOIN LATERAL
+              (SELECT * FROM prt1_l t2 TABLESAMPLE SYSTEM (t1.a) REPEATABLE(t1.b)) s ON
+              t1.a = s.a AND t1.b = s.b AND t1.c = s.c;
+                                       QUERY PLAN
+----------------------------------------------------------------------------------------
+ Append
+   ->  Nested Loop
+         ->  Seq Scan on prt1_l_p1 t1_1
+         ->  Sample Scan on prt1_l_p1 t2_1
+               Sampling: system (t1_1.a) REPEATABLE (t1_1.b)
+               Filter: ((t1_1.a = a) AND (t1_1.b = b) AND ((t1_1.c)::text = (c)::text))
+   ->  Nested Loop
+         ->  Seq Scan on prt1_l_p2_p1 t1_2
+         ->  Sample Scan on prt1_l_p2_p1 t2_2
+               Sampling: system (t1_2.a) REPEATABLE (t1_2.b)
+               Filter: ((t1_2.a = a) AND (t1_2.b = b) AND ((t1_2.c)::text = (c)::text))
+   ->  Nested Loop
+         ->  Seq Scan on prt1_l_p2_p2 t1_3
+         ->  Sample Scan on prt1_l_p2_p2 t2_3
+               Sampling: system (t1_3.a) REPEATABLE (t1_3.b)
+               Filter: ((t1_3.a = a) AND (t1_3.b = b) AND ((t1_3.c)::text = (c)::text))
+   ->  Nested Loop
+         ->  Seq Scan on prt1_l_p3_p1 t1_4
+         ->  Sample Scan on prt1_l_p3_p1 t2_4
+               Sampling: system (t1_4.a) REPEATABLE (t1_4.b)
+               Filter: ((t1_4.a = a) AND (t1_4.b = b) AND ((t1_4.c)::text = (c)::text))
+   ->  Nested Loop
+         ->  Seq Scan on prt1_l_p3_p2 t1_5
+         ->  Sample Scan on prt1_l_p3_p2 t2_5
+               Sampling: system (t1_5.a) REPEATABLE (t1_5.b)
+               Filter: ((t1_5.a = a) AND (t1_5.b = b) AND ((t1_5.c)::text = (c)::text))
+(26 rows)
+
 -- join with one side empty
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c FROM (SELECT * FROM prt1_l WHERE a = 1 AND a = 2) t1 RIGHT JOIN prt2_l t2 ON t1.a = t2.b
ANDt1.b = t2.a AND t1.c = t2.c; 
diff --git a/src/test/regress/sql/partition_join.sql b/src/test/regress/sql/partition_join.sql
index 48daf3aee3..e2daab03fb 100644
--- a/src/test/regress/sql/partition_join.sql
+++ b/src/test/regress/sql/partition_join.sql
@@ -100,6 +100,12 @@ SELECT t1.a, ss.t2a, ss.t2c FROM prt1 t1 LEFT JOIN LATERAL
               (SELECT t2.a AS t2a, t3.a AS t3a, t2.b t2b, t2.c t2c, least(t1.a,t2.a,t3.a) FROM prt1 t2 JOIN prt2 t3 ON
(t2.a= t3.b)) ss 
               ON t1.c = ss.t2c WHERE (t1.b + coalesce(ss.t2b, 0)) = 0 ORDER BY t1.a;

+-- lateral reference in sample scan
+EXPLAIN (COSTS OFF)
+SELECT * FROM prt1 t1 JOIN LATERAL
+              (SELECT * FROM prt1 t2 TABLESAMPLE SYSTEM (t1.a) REPEATABLE(t1.b)) s
+              ON t1.a = s.a;
+
 -- bug with inadequate sort key representation
 SET enable_partitionwise_aggregate TO true;
 SET enable_hashjoin TO false;
@@ -387,6 +393,12 @@ SELECT * FROM prt1_l t1 LEFT JOIN LATERAL
               (SELECT t2.a AS t2a, t2.c AS t2c, t2.b AS t2b, t3.b AS t3b, least(t1.a,t2.a,t3.b) FROM prt1_l t2 JOIN
prt2_lt3 ON (t2.a = t3.b AND t2.c = t3.c)) ss 
               ON t1.a = ss.t2a AND t1.c = ss.t2c WHERE t1.b = 0 ORDER BY t1.a;

+-- partitionwise join with lateral reference in sample scan
+EXPLAIN (COSTS OFF)
+SELECT * FROM prt1_l t1 JOIN LATERAL
+              (SELECT * FROM prt1_l t2 TABLESAMPLE SYSTEM (t1.a) REPEATABLE(t1.b)) s ON
+              t1.a = s.a AND t1.b = s.b AND t1.c = s.c;
+
 -- join with one side empty
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c FROM (SELECT * FROM prt1_l WHERE a = 1 AND a = 2) t1 RIGHT JOIN prt2_l t2 ON t1.a = t2.b
ANDt1.b = t2.a AND t1.c = t2.c; 
diff -U3 /home/postgres/REL_15/pgsql/src/test/regress/expected/partition_join.out
/home/postgres/REL_15/pgsql/src/test/regress/results/partition_join.out
--- /home/postgres/REL_15/pgsql/src/test/regress/expected/partition_join.out    2023-08-21 15:00:46.578980354 -0400
+++ /home/postgres/REL_15/pgsql/src/test/regress/results/partition_join.out    2023-08-21 15:12:41.109495743 -0400
@@ -123,11 +123,12 @@
                      ->  Seq Scan on prt2_p2 t2_2
                            Filter: (a = 0)
          ->  Nested Loop Left Join
+               Join Filter: (t1_3.a = t2_3.b)
                ->  Seq Scan on prt2_p3 t2_3
                      Filter: (a = 0)
                ->  Index Scan using iprt1_p3_a on prt1_p3 t1_3
                      Index Cond: (a = t2_3.b)
-(20 rows)
+(21 rows)

 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1 RIGHT JOIN prt2 t2 ON t1.a = t2.b WHERE t2.a = 0 ORDER BY t1.a, t2.b;
   a  |  c   |  b  |  c
@@ -362,30 +363,36 @@
    Sort Key: t1.a
    ->  Append
          ->  Nested Loop Left Join
+               Join Filter: (t1_1.a = t2_1.a)
                ->  Seq Scan on prt1_p1 t1_1
                      Filter: (b = 0)
                ->  Nested Loop
+                     Join Filter: (t2_1.a = t3_1.b)
                      ->  Index Only Scan using iprt1_p1_a on prt1_p1 t2_1
                            Index Cond: (a = t1_1.a)
                      ->  Index Scan using iprt2_p1_b on prt2_p1 t3_1
                            Index Cond: (b = t2_1.a)
          ->  Nested Loop Left Join
+               Join Filter: (t1_2.a = t2_2.a)
                ->  Seq Scan on prt1_p2 t1_2
                      Filter: (b = 0)
                ->  Nested Loop
+                     Join Filter: (t2_2.a = t3_2.b)
                      ->  Index Only Scan using iprt1_p2_a on prt1_p2 t2_2
                            Index Cond: (a = t1_2.a)
                      ->  Index Scan using iprt2_p2_b on prt2_p2 t3_2
                            Index Cond: (b = t2_2.a)
          ->  Nested Loop Left Join
+               Join Filter: (t1_3.a = t2_3.a)
                ->  Seq Scan on prt1_p3 t1_3
                      Filter: (b = 0)
                ->  Nested Loop
+                     Join Filter: (t2_3.a = t3_3.b)
                      ->  Index Only Scan using iprt1_p3_a on prt1_p3 t2_3
                            Index Cond: (a = t1_3.a)
                      ->  Index Scan using iprt2_p3_b on prt2_p3 t3_3
                            Index Cond: (b = t2_3.a)
-(27 rows)
+(33 rows)

 SELECT * FROM prt1 t1 LEFT JOIN LATERAL
               (SELECT t2.a AS t2a, t3.a AS t3a, least(t1.a,t2.a,t3.b) FROM prt1 t2 JOIN prt2 t3 ON (t2.a = t3.b)) ss
@@ -468,21 +475,24 @@
 -------------------------------------------------------------
  Append
    ->  Nested Loop
+         Join Filter: (t1_1.a = t2_1.a)
          ->  Seq Scan on prt1_p1 t1_1
          ->  Sample Scan on prt1_p1 t2_1
                Sampling: system (t1_1.a) REPEATABLE (t1_1.b)
                Filter: (t1_1.a = a)
    ->  Nested Loop
+         Join Filter: (t1_2.a = t2_2.a)
          ->  Seq Scan on prt1_p2 t1_2
          ->  Sample Scan on prt1_p2 t2_2
                Sampling: system (t1_2.a) REPEATABLE (t1_2.b)
                Filter: (t1_2.a = a)
    ->  Nested Loop
+         Join Filter: (t1_3.a = t2_3.a)
          ->  Seq Scan on prt1_p3 t1_3
          ->  Sample Scan on prt1_p3 t2_3
                Sampling: system (t1_3.a) REPEATABLE (t1_3.b)
                Filter: (t1_3.a = a)
-(16 rows)
+(19 rows)

 -- bug with inadequate sort key representation
 SET enable_partitionwise_aggregate TO true;
@@ -730,6 +740,7 @@
    Sort Key: t1.a, t2.b, ((t3.a + t3.b))
    ->  Append
          ->  Nested Loop Left Join
+               Join Filter: (t1_1.a = t2_1.b)
                ->  Hash Right Join
                      Hash Cond: (t1_1.a = ((t3_1.a + t3_1.b) / 2))
                      ->  Seq Scan on prt1_p1 t1_1
@@ -739,6 +750,7 @@
                ->  Index Scan using iprt2_p1_b on prt2_p1 t2_1
                      Index Cond: (b = t1_1.a)
          ->  Nested Loop Left Join
+               Join Filter: (t1_2.a = t2_2.b)
                ->  Hash Right Join
                      Hash Cond: (t1_2.a = ((t3_2.a + t3_2.b) / 2))
                      ->  Seq Scan on prt1_p2 t1_2
@@ -748,6 +760,7 @@
                ->  Index Scan using iprt2_p2_b on prt2_p2 t2_2
                      Index Cond: (b = t1_2.a)
          ->  Nested Loop Left Join
+               Join Filter: (t1_3.a = t2_3.b)
                ->  Hash Right Join
                      Hash Cond: (t1_3.a = ((t3_3.a + t3_3.b) / 2))
                      ->  Seq Scan on prt1_p3 t1_3
@@ -756,7 +769,7 @@
                                  Filter: (c = 0)
                ->  Index Scan using iprt2_p3_b on prt2_p3 t2_3
                      Index Cond: (b = t1_3.a)
-(30 rows)
+(33 rows)

 SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM (prt1 t1 LEFT JOIN prt2 t2 ON t1.a = t2.b) RIGHT JOIN prt1_e t3
ON(t1.a = (t3.a + t3.b)/2) WHERE t3.c = 0 ORDER BY t1.a, t2.b, t3.a + t3.b; 
   a  |  c   |  b  |  c   | ?column? | c
@@ -981,6 +994,7 @@
                ->  HashAggregate
                      Group Key: t1_7.b
                      ->  Nested Loop
+                           Join Filter: (t1_7.b = ((t2_3.a + t2_3.b) / 2))
                            ->  Seq Scan on prt2_p3 t1_7
                                  Filter: (a = 0)
                            ->  Index Scan using iprt1_e_p3_ab2 on prt1_e_p3 t2_3
@@ -988,7 +1002,7 @@
                ->  Index Scan using iprt1_p3_a on prt1_p3 t1_4
                      Index Cond: (a = ((t2_3.a + t2_3.b) / 2))
                      Filter: (b = 0)
-(41 rows)
+(42 rows)

 SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1, prt1_e t2 WHERE t1.a = 0 AND t1.b = (t2.a + t2.b)/2)
ANDt1.b = 0 ORDER BY t1.a; 
   a  | b |  c
@@ -1007,6 +1021,7 @@
    Sort Key: t1.a
    ->  Append
          ->  Nested Loop
+               Join Filter: (t1_3.a = t1_6.b)
                ->  HashAggregate
                      Group Key: t1_6.b
                      ->  Hash Semi Join
@@ -1019,6 +1034,7 @@
                      Index Cond: (a = t1_6.b)
                      Filter: (b = 0)
          ->  Nested Loop
+               Join Filter: (t1_4.a = t1_7.b)
                ->  HashAggregate
                      Group Key: t1_7.b
                      ->  Hash Semi Join
@@ -1031,6 +1047,7 @@
                      Index Cond: (a = t1_7.b)
                      Filter: (b = 0)
          ->  Nested Loop
+               Join Filter: (t1_5.a = t1_8.b)
                ->  HashAggregate
                      Group Key: t1_8.b
                      ->  Hash Semi Join
@@ -1042,7 +1059,7 @@
                ->  Index Scan using iprt1_p3_a on prt1_p3 t1_5
                      Index Cond: (a = t1_8.b)
                      Filter: (b = 0)
-(39 rows)
+(42 rows)

 SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1 WHERE t1.b IN (SELECT (t1.a + t1.b)/2 FROM prt1_e t1
WHEREt1.c = 0)) AND t1.b = 0 ORDER BY t1.a; 
   a  | b |  c
@@ -1861,6 +1878,7 @@
    Sort Key: t1.a
    ->  Append
          ->  Nested Loop Left Join
+               Join Filter: ((t1_1.a = t2_1.a) AND ((t1_1.c)::text = (t2_1.c)::text))
                ->  Seq Scan on prt1_l_p1 t1_1
                      Filter: (b = 0)
                ->  Hash Join
@@ -1870,6 +1888,7 @@
                            ->  Seq Scan on prt1_l_p1 t2_1
                                  Filter: ((t1_1.a = a) AND ((t1_1.c)::text = (c)::text))
          ->  Nested Loop Left Join
+               Join Filter: ((t1_2.a = t2_2.a) AND ((t1_2.c)::text = (t2_2.c)::text))
                ->  Seq Scan on prt1_l_p2_p1 t1_2
                      Filter: (b = 0)
                ->  Hash Join
@@ -1879,6 +1898,7 @@
                            ->  Seq Scan on prt1_l_p2_p1 t2_2
                                  Filter: ((t1_2.a = a) AND ((t1_2.c)::text = (c)::text))
          ->  Nested Loop Left Join
+               Join Filter: ((t1_3.a = t2_3.a) AND ((t1_3.c)::text = (t2_3.c)::text))
                ->  Seq Scan on prt1_l_p2_p2 t1_3
                      Filter: (b = 0)
                ->  Hash Join
@@ -1888,6 +1908,7 @@
                            ->  Seq Scan on prt1_l_p2_p2 t2_3
                                  Filter: ((t1_3.a = a) AND ((t1_3.c)::text = (c)::text))
          ->  Nested Loop Left Join
+               Join Filter: ((t1_4.a = t2_5.a) AND ((t1_4.c)::text = (t2_5.c)::text))
                ->  Seq Scan on prt1_l_p3_p1 t1_4
                      Filter: (b = 0)
                ->  Hash Join
@@ -1901,7 +1922,7 @@
                                        Filter: ((t1_4.a = a) AND ((t1_4.c)::text = (c)::text))
                                  ->  Seq Scan on prt1_l_p3_p2 t2_6
                                        Filter: ((t1_4.a = a) AND ((t1_4.c)::text = (c)::text))
-(44 rows)
+(48 rows)

 SELECT * FROM prt1_l t1 LEFT JOIN LATERAL
               (SELECT t2.a AS t2a, t2.c AS t2c, t2.b AS t2b, t3.b AS t3b, least(t1.a,t2.a,t3.b) FROM prt1_l t2 JOIN
prt2_lt3 ON (t2.a = t3.b AND t2.c = t3.c)) ss 
@@ -1927,35 +1948,40 @@
 SELECT * FROM prt1_l t1 JOIN LATERAL
               (SELECT * FROM prt1_l t2 TABLESAMPLE SYSTEM (t1.a) REPEATABLE(t1.b)) s ON
               t1.a = s.a AND t1.b = s.b AND t1.c = s.c;
-                                       QUERY PLAN
-----------------------------------------------------------------------------------------
+                                              QUERY PLAN
+------------------------------------------------------------------------------------------------------
  Append
    ->  Nested Loop
+         Join Filter: ((t1_1.a = t2_1.a) AND (t1_1.b = t2_1.b) AND ((t1_1.c)::text = (t2_1.c)::text))
          ->  Seq Scan on prt1_l_p1 t1_1
          ->  Sample Scan on prt1_l_p1 t2_1
                Sampling: system (t1_1.a) REPEATABLE (t1_1.b)
                Filter: ((t1_1.a = a) AND (t1_1.b = b) AND ((t1_1.c)::text = (c)::text))
    ->  Nested Loop
+         Join Filter: ((t1_2.a = t2_2.a) AND (t1_2.b = t2_2.b) AND ((t1_2.c)::text = (t2_2.c)::text))
          ->  Seq Scan on prt1_l_p2_p1 t1_2
          ->  Sample Scan on prt1_l_p2_p1 t2_2
                Sampling: system (t1_2.a) REPEATABLE (t1_2.b)
                Filter: ((t1_2.a = a) AND (t1_2.b = b) AND ((t1_2.c)::text = (c)::text))
    ->  Nested Loop
+         Join Filter: ((t1_3.a = t2_3.a) AND (t1_3.b = t2_3.b) AND ((t1_3.c)::text = (t2_3.c)::text))
          ->  Seq Scan on prt1_l_p2_p2 t1_3
          ->  Sample Scan on prt1_l_p2_p2 t2_3
                Sampling: system (t1_3.a) REPEATABLE (t1_3.b)
                Filter: ((t1_3.a = a) AND (t1_3.b = b) AND ((t1_3.c)::text = (c)::text))
    ->  Nested Loop
+         Join Filter: ((t1_4.a = t2_4.a) AND (t1_4.b = t2_4.b) AND ((t1_4.c)::text = (t2_4.c)::text))
          ->  Seq Scan on prt1_l_p3_p1 t1_4
          ->  Sample Scan on prt1_l_p3_p1 t2_4
                Sampling: system (t1_4.a) REPEATABLE (t1_4.b)
                Filter: ((t1_4.a = a) AND (t1_4.b = b) AND ((t1_4.c)::text = (c)::text))
    ->  Nested Loop
+         Join Filter: ((t1_5.a = t2_5.a) AND (t1_5.b = t2_5.b) AND ((t1_5.c)::text = (t2_5.c)::text))
          ->  Seq Scan on prt1_l_p3_p2 t1_5
          ->  Sample Scan on prt1_l_p3_p2 t2_5
                Sampling: system (t1_5.a) REPEATABLE (t1_5.b)
                Filter: ((t1_5.a = a) AND (t1_5.b = b) AND ((t1_5.c)::text = (c)::text))
-(26 rows)
+(31 rows)

 -- join with one side empty
 EXPLAIN (COSTS OFF)
@@ -2242,10 +2268,11 @@
 where not exists (select 1 from prtx2
                   where prtx2.a=prtx1.a and prtx2.b=prtx1.b and prtx2.c=123)
   and a<20 and c=120;
-                         QUERY PLAN
--------------------------------------------------------------
+                                 QUERY PLAN
+----------------------------------------------------------------------------
  Append
    ->  Nested Loop Anti Join
+         Join Filter: ((prtx2_1.a = prtx1_1.a) AND (prtx2_1.b = prtx1_1.b))
          ->  Seq Scan on prtx1_1
                Filter: ((a < 20) AND (c = 120))
          ->  Bitmap Heap Scan on prtx2_1
@@ -2257,6 +2284,7 @@
                      ->  Bitmap Index Scan on prtx2_1_c_idx
                            Index Cond: (c = 123)
    ->  Nested Loop Anti Join
+         Join Filter: ((prtx2_2.a = prtx1_2.a) AND (prtx2_2.b = prtx1_2.b))
          ->  Seq Scan on prtx1_2
                Filter: ((a < 20) AND (c = 120))
          ->  Bitmap Heap Scan on prtx2_2
@@ -2267,7 +2295,7 @@
                            Index Cond: (b = prtx1_2.b)
                      ->  Bitmap Index Scan on prtx2_2_c_idx
                            Index Cond: (c = 123)
-(23 rows)
+(25 rows)

 select * from prtx1
 where not exists (select 1 from prtx2
@@ -2283,10 +2311,11 @@
 where not exists (select 1 from prtx2
                   where prtx2.a=prtx1.a and (prtx2.b=prtx1.b+1 or prtx2.c=99))
   and a<20 and c=91;
-                           QUERY PLAN
------------------------------------------------------------------
+                                               QUERY PLAN
+--------------------------------------------------------------------------------------------------------
  Append
    ->  Nested Loop Anti Join
+         Join Filter: ((prtx2_1.a = prtx1_1.a) AND ((prtx2_1.b = (prtx1_1.b + 1)) OR (prtx2_1.c = 99)))
          ->  Seq Scan on prtx1_1
                Filter: ((a < 20) AND (c = 91))
          ->  Bitmap Heap Scan on prtx2_1
@@ -2298,6 +2327,7 @@
                      ->  Bitmap Index Scan on prtx2_1_c_idx
                            Index Cond: (c = 99)
    ->  Nested Loop Anti Join
+         Join Filter: ((prtx2_2.a = prtx1_2.a) AND ((prtx2_2.b = (prtx1_2.b + 1)) OR (prtx2_2.c = 99)))
          ->  Seq Scan on prtx1_2
                Filter: ((a < 20) AND (c = 91))
          ->  Bitmap Heap Scan on prtx2_2
@@ -2308,7 +2338,7 @@
                            Index Cond: (b = (prtx1_2.b + 1))
                      ->  Bitmap Index Scan on prtx2_2_c_idx
                            Index Cond: (c = 99)
-(23 rows)
+(25 rows)

 select * from prtx1
 where not exists (select 1 from prtx2
@@ -2803,7 +2833,9 @@
    Sort Key: t1.b, t2.a
    ->  Append
          ->  Nested Loop Left Join
+               Join Filter: (t1_1.b = t2_1.a)
                ->  Nested Loop
+                     Join Filter: (t1_1.b = t3_1.a)
                      ->  Seq Scan on prt2_adv_p1 t1_1
                            Filter: (a = 0)
                      ->  Index Scan using prt1_adv_p1_a_idx on prt1_adv_p1 t3_1
@@ -2830,7 +2862,7 @@
                            ->  Hash
                                  ->  Seq Scan on prt2_adv_p3 t1_3
                                        Filter: (a = 0)
-(31 rows)
+(33 rows)

 SELECT t1.b, t1.c, t2.a, t2.c, t3.a, t3.c FROM prt2_adv t1 LEFT JOIN prt1_adv t2 ON (t1.b = t2.a) INNER JOIN prt1_adv
t3ON (t1.b = t3.a) WHERE t1.a = 0 ORDER BY t1.b, t2.a, t3.a; 
   b  |  c   |  a  |  c   |  a  |  c
@@ -4957,14 +4989,16 @@
    ->  Merge Append
          Sort Key: x.id DESC
          ->  Nested Loop Left Join
+               Join Filter: (x_1.id = y_1.id)
                ->  Index Only Scan Backward using fract_t0_pkey on fract_t0 x_1
                ->  Index Only Scan using fract_t0_pkey on fract_t0 y_1
                      Index Cond: (id = x_1.id)
          ->  Nested Loop Left Join
+               Join Filter: (x_2.id = y_2.id)
                ->  Index Only Scan Backward using fract_t1_pkey on fract_t1 x_2
                ->  Index Only Scan using fract_t1_pkey on fract_t1 y_2
                      Index Cond: (id = x_2.id)
-(11 rows)
+(13 rows)

 -- cleanup
 DROP TABLE fract_t;
diff -U3 /home/postgres/REL_15/pgsql/src/test/regress/expected/memoize.out
/home/postgres/REL_15/pgsql/src/test/regress/results/memoize.out
--- /home/postgres/REL_15/pgsql/src/test/regress/expected/memoize.out    2023-01-24 11:53:26.646912951 -0500
+++ /home/postgres/REL_15/pgsql/src/test/regress/results/memoize.out    2023-08-21 15:12:40.813495545 -0400
@@ -215,6 +215,7 @@
 ------------------------------------------------------------------------------------------
  Append (actual rows=32 loops=N)
    ->  Nested Loop (actual rows=16 loops=N)
+         Join Filter: (t1_1.a = t2_1.a)
          ->  Index Only Scan using iprt_p1_a on prt_p1 t1_1 (actual rows=4 loops=N)
                Heap Fetches: N
          ->  Memoize (actual rows=4 loops=N)
@@ -225,6 +226,7 @@
                      Index Cond: (a = t1_1.a)
                      Heap Fetches: N
    ->  Nested Loop (actual rows=16 loops=N)
+         Join Filter: (t1_2.a = t2_2.a)
          ->  Index Only Scan using iprt_p2_a on prt_p2 t1_2 (actual rows=4 loops=N)
                Heap Fetches: N
          ->  Memoize (actual rows=4 loops=N)
@@ -234,7 +236,7 @@
                ->  Index Only Scan using iprt_p2_a on prt_p2 t2_2 (actual rows=4 loops=N)
                      Index Cond: (a = t1_2.a)
                      Heap Fetches: N
-(21 rows)
+(23 rows)

 DROP TABLE prt;
 RESET enable_partitionwise_join;

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: SLRUs in the main buffer pool - Page Header definitions
Next
From: Jeff Davis
Date:
Subject: Re: CREATE FUNCTION ... SEARCH { DEFAULT | SYSTEM | SESSION }