Re: d25ea01275 and partitionwise join - Mailing list pgsql-hackers

From Tom Lane
Subject Re: d25ea01275 and partitionwise join
Date
Msg-id 19407.1586125744@sss.pgh.pa.us
Whole thread Raw
In response to Re: d25ea01275 and partitionwise join  (Amit Langote <amitlangote09@gmail.com>)
Responses Re: d25ea01275 and partitionwise join  (Amit Langote <amitlangote09@gmail.com>)
List pgsql-hackers
Amit Langote <amitlangote09@gmail.com> writes:
> Okay, I tried that in the updated patch. I didn't try to come up with
> examples that might break it though.

I looked through this.

* Wasn't excited about inventing makeCoalesceExpr(); the fact that it only
had two potential call sites seemed to make it not worth the trouble.
Plus, as defined, it could not handle the general case of COALESCE, which
can have N arguments; so that seemed like a recipe for confusion.

* I think your logic for building the coalesce combinations was just
wrong.  We need combinations of left-hand inputs with right-hand inputs,
not left-hand with left-hand or right-hand with right-hand.  Also the
nesting should already have happened in the inputs, we don't need to
try to generate it here.  The looping code was pretty messy as well.

* I don't think using partopcintype is necessarily right; that could be
a polymorphic type, for instance.  Safer to copy the type of the input
expressions.  Likely we could have got away with using partcollation,
but for consistency I copied that as well.

* You really need to update the data structure definitional comments
when you make a change like this.

* I did not like putting a test case that requires
enable_partitionwise_aggregate in the partition_join test; that seems
misplaced.  But it's not necessary to the test, is it?

* I did not follow the point of your second test case.  The WHERE
constraint on p1.a allows the planner to strength-reduce the joins,
which is why there's no full join in that explain result, but then
we aren't going to get to this code at all.

I repaired the above in the attached.

I'm actually sort of pleasantly surprised that this worked; I was
not sure that building COALESCEs like this would provide the result
we needed.  But it seems okay -- it does fix the behavior in this
3-way test case, as well as the 4-way join you showed at the top
of the thread.  It's fairly dependent on the fact that the planner
won't rearrange full joins; otherwise, the COALESCE structures we
build here might not match those made at parse time.  But that's
not likely to change anytime soon; and this is hardly the only
place that would break, so I'm not sweating about it.  (I have
some vague ideas about getting rid of the COALESCEs as part of
the Var redefinition I've been muttering about, anyway; there
might be a cleaner fix for this if that happens.)

Anyway, I think this is probably OK for now.  Given that the
nullable_partexprs lists are only used in one place, it's pretty
hard to see how it would break anything.

            regards, tom lane

diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index af1fb48..e1cc11c 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -17,6 +17,7 @@
 #include <limits.h>

 #include "miscadmin.h"
+#include "nodes/nodeFuncs.h"
 #include "optimizer/appendinfo.h"
 #include "optimizer/clauses.h"
 #include "optimizer/cost.h"
@@ -1890,7 +1891,8 @@ set_joinrel_partition_key_exprs(RelOptInfo *joinrel,
                                 RelOptInfo *outer_rel, RelOptInfo *inner_rel,
                                 JoinType jointype)
 {
-    int            partnatts = joinrel->part_scheme->partnatts;
+    PartitionScheme part_scheme = joinrel->part_scheme;
+    int            partnatts = part_scheme->partnatts;

     joinrel->partexprs = (List **) palloc0(sizeof(List *) * partnatts);
     joinrel->nullable_partexprs =
@@ -1899,7 +1901,8 @@ set_joinrel_partition_key_exprs(RelOptInfo *joinrel,
     /*
      * The joinrel's partition expressions are the same as those of the input
      * rels, but we must properly classify them as nullable or not in the
-     * joinrel's output.
+     * joinrel's output.  (Also, we add some more partition expressions if
+     * it's a FULL JOIN.)
      */
     for (int cnt = 0; cnt < partnatts; cnt++)
     {
@@ -1910,6 +1913,7 @@ set_joinrel_partition_key_exprs(RelOptInfo *joinrel,
         const List *inner_null_expr = inner_rel->nullable_partexprs[cnt];
         List       *partexpr = NIL;
         List       *nullable_partexpr = NIL;
+        ListCell   *lc;

         switch (jointype)
         {
@@ -1969,6 +1973,31 @@ set_joinrel_partition_key_exprs(RelOptInfo *joinrel,
                                                 outer_null_expr);
                 nullable_partexpr = list_concat(nullable_partexpr,
                                                 inner_null_expr);
+
+                /*
+                 * Also add CoalesceExprs corresponding to each possible
+                 * full-join output variable (that is, left side coalesced to
+                 * right side), so that we can match equijoin expressions
+                 * using those variables.  We assume no type coercions are
+                 * needed to make the join outputs.
+                 */
+                foreach(lc, list_concat_copy(outer_expr, outer_null_expr))
+                {
+                    Node       *larg = (Node *) lfirst(lc);
+                    ListCell   *lc2;
+
+                    foreach(lc2, list_concat_copy(inner_expr, inner_null_expr))
+                    {
+                        Node       *rarg = (Node *) lfirst(lc2);
+                        CoalesceExpr *c = makeNode(CoalesceExpr);
+
+                        c->coalescetype = exprType(larg);
+                        c->coalescecollid = exprCollation(larg);
+                        c->args = list_make2(larg, rarg);
+                        c->location = -1;
+                        nullable_partexpr = lappend(nullable_partexpr, c);
+                    }
+                }
                 break;

             default:
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 469c686..39c7b2f 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -613,6 +613,9 @@ typedef struct PartitionSchemeData *PartitionScheme;
  * that expression goes in the partexprs[i] list if the base relation
  * is not nullable by this join or any lower outer join, or in the
  * nullable_partexprs[i] list if the base relation is nullable.
+ * Furthermore, FULL JOINs add extra nullable_partexprs expressions
+ * corresponding to COALESCE expressions of the left and right join columns,
+ * to simplify matching join clauses to those lists.
  *----------
  */
 typedef enum RelOptKind
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index b3fbe47..cd60b6a 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -750,6 +750,55 @@ SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM (prt1 t1 LEFT JOIN prt2 t2
  550 | 0550 |     |      |     1100 | 0
 (12 rows)

+--
+-- 3-way full join
+--
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) FROM prt1 FULL JOIN prt2 p2(b,a,c) USING(a,b) FULL JOIN prt2 p3(b,a,c) USING (a, b)
+  WHERE a BETWEEN 490 AND 510;
+                                                               QUERY PLAN
                  

+-----------------------------------------------------------------------------------------------------------------------------------------
+ Aggregate
+   ->  Append
+         ->  Hash Full Join
+               Hash Cond: ((COALESCE(prt1_1.a, p2_1.a) = p3_1.a) AND (COALESCE(prt1_1.b, p2_1.b) = p3_1.b))
+               Filter: ((COALESCE(COALESCE(prt1_1.a, p2_1.a), p3_1.a) >= 490) AND (COALESCE(COALESCE(prt1_1.a,
p2_1.a),p3_1.a) <= 510)) 
+               ->  Hash Full Join
+                     Hash Cond: ((prt1_1.a = p2_1.a) AND (prt1_1.b = p2_1.b))
+                     ->  Seq Scan on prt1_p1 prt1_1
+                     ->  Hash
+                           ->  Seq Scan on prt2_p1 p2_1
+               ->  Hash
+                     ->  Seq Scan on prt2_p1 p3_1
+         ->  Hash Full Join
+               Hash Cond: ((COALESCE(prt1_2.a, p2_2.a) = p3_2.a) AND (COALESCE(prt1_2.b, p2_2.b) = p3_2.b))
+               Filter: ((COALESCE(COALESCE(prt1_2.a, p2_2.a), p3_2.a) >= 490) AND (COALESCE(COALESCE(prt1_2.a,
p2_2.a),p3_2.a) <= 510)) 
+               ->  Hash Full Join
+                     Hash Cond: ((prt1_2.a = p2_2.a) AND (prt1_2.b = p2_2.b))
+                     ->  Seq Scan on prt1_p2 prt1_2
+                     ->  Hash
+                           ->  Seq Scan on prt2_p2 p2_2
+               ->  Hash
+                     ->  Seq Scan on prt2_p2 p3_2
+         ->  Hash Full Join
+               Hash Cond: ((COALESCE(prt1_3.a, p2_3.a) = p3_3.a) AND (COALESCE(prt1_3.b, p2_3.b) = p3_3.b))
+               Filter: ((COALESCE(COALESCE(prt1_3.a, p2_3.a), p3_3.a) >= 490) AND (COALESCE(COALESCE(prt1_3.a,
p2_3.a),p3_3.a) <= 510)) 
+               ->  Hash Full Join
+                     Hash Cond: ((prt1_3.a = p2_3.a) AND (prt1_3.b = p2_3.b))
+                     ->  Seq Scan on prt1_p3 prt1_3
+                     ->  Hash
+                           ->  Seq Scan on prt2_p3 p2_3
+               ->  Hash
+                     ->  Seq Scan on prt2_p3 p3_3
+(32 rows)
+
+SELECT COUNT(*) FROM prt1 FULL JOIN prt2 p2(b,a,c) USING(a,b) FULL JOIN prt2 p3(b,a,c) USING (a, b)
+  WHERE a BETWEEN 490 AND 510;
+ count
+-------
+    14
+(1 row)
+
 -- Cases with non-nullable expressions in subquery results;
 -- make sure these go to null as expected
 EXPLAIN (COSTS OFF)
diff --git a/src/test/regress/sql/partition_join.sql b/src/test/regress/sql/partition_join.sql
index 575ba7b..6184bbd 100644
--- a/src/test/regress/sql/partition_join.sql
+++ b/src/test/regress/sql/partition_join.sql
@@ -145,6 +145,15 @@ EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM (prt1 t1 LEFT JOIN prt2 t2 ON t1.a = t2.b) RIGHT JOIN prt1_e t3
ON(t1.a = (t3.a + t3.b)/2) WHERE t3.c = 0 ORDER BY t1.a, t2.b, t3.a + t3.b; 
 SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM (prt1 t1 LEFT JOIN prt2 t2 ON t1.a = t2.b) RIGHT JOIN prt1_e t3
ON(t1.a = (t3.a + t3.b)/2) WHERE t3.c = 0 ORDER BY t1.a, t2.b, t3.a + t3.b; 

+--
+-- 3-way full join
+--
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) FROM prt1 FULL JOIN prt2 p2(b,a,c) USING(a,b) FULL JOIN prt2 p3(b,a,c) USING (a, b)
+  WHERE a BETWEEN 490 AND 510;
+SELECT COUNT(*) FROM prt1 FULL JOIN prt2 p2(b,a,c) USING(a,b) FULL JOIN prt2 p3(b,a,c) USING (a, b)
+  WHERE a BETWEEN 490 AND 510;
+
 -- Cases with non-nullable expressions in subquery results;
 -- make sure these go to null as expected
 EXPLAIN (COSTS OFF)

pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: range_agg
Next
From: Andres Freund
Date:
Subject: WAL page magic errors (and plenty others) got hard to debug.