Re: Bugs in planner's equivalence-class processing - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Bugs in planner's equivalence-class processing
Date
Msg-id 11539.1350515198@sss.pgh.pa.us
Whole thread Raw
In response to Bugs in planner's equivalence-class processing  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I wrote:
> So this is essentially an oversight in the patch that added tracking of
> nullable_relids.  I got confused about the difference between
> outerjoin_delayed (this clause, as a whole, is not outerjoin_delayed
> because its natural semantic level would be at the join anyway) and
> having nonempty nullable_relids, and thought that equivalence-class
> processing would never see such clauses because it doesn't accept
> outerjoin_delayed clauses.  So there's no code in equivclass.c to
> compute nullable_relids sets for constructed clauses.  At some point
> it might be worth adding same, but for now I'm just going to tweak
> distribute_qual_to_rels to not believe that clauses with nonempty
> nullable_relids can be equivalence clauses.

I tried the latter, with the first patch attached below, and it fixed
the incorrect-plan problem.  However, further testing showed that this
method also breaks one of the optimizations implemented in equivclass.c,
which is the ability to push constants through full joins, as in

    select * from tenk1 a full join tenk1 b using (unique1)
    where unique1 = 42;

We don't seem to have a regression test case for that optimization,
which is probably because it predates the availability of EXPLAIN's
COSTS OFF option.  (I've added one in the second patch below.)
I thought for a minute about accepting that as fallout, but I know
for certain that we'd get pushback on it, because we did last time
I broke it :-(.

So it appears that what we have to do is upgrade the equivalence-class
machinery to handle nullable_relids properly.  The second patch attached
below does that.  It's considerably larger than the quick-hack patch,
though not as large as I'd originally feared.

Anyway, the question now is whether to back-patch this.  We do have
pretty much the same equivalence-class infrastructure in all versions
since 8.3, so it's possible to do it ... but it's a bit nervous-making.
On the other hand, letting a known bug go unfixed isn't attractive
either.

            regards, tom lane

diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 3c7fa632b8ebe26c1d91e83e0ba1fa9c03b9e919..1e1a9b236aeb0672645828b8d7dfd458a6ef3710 100644
*** a/src/backend/optimizer/plan/initsplan.c
--- b/src/backend/optimizer/plan/initsplan.c
*************** distribute_qual_to_rels(PlannerInfo *roo
*** 988,993 ****
--- 988,1010 ----
              if (check_redundant_nullability_qual(root, clause))
                  return;
          }
+         else if (!bms_is_empty(nullable_relids))
+         {
+             /*
+              * Although the qual doesn't have to be delayed to above its
+              * natural syntactic level, it does contain references to rels
+              * that are nulled by lower outer joins, so don't treat it as a
+              * possible equivalence clause.
+              *
+              * XXX in the future this restriction could be relaxed, but there
+              * are a couple of stumbling blocks.  First, we'd have to verify
+              * that the no-delay property applies to each side of the equality
+              * separately, and second, the equivalence machinery would have to
+              * be improved to create output RestrictInfos with appropriate
+              * (possibly non-empty) nullable_relids values.
+              */
+             maybe_equivalence = false;
+         }
          else
          {
              /*
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 624b7455f365fd1b49c582a368e692cc2f584f74..58606fa95af4f9497d06ce9a95d3e57407e46845 100644
*** a/src/test/regress/expected/join.out
--- b/src/test/regress/expected/join.out
*************** SELECT qq, unique1
*** 2554,2560 ****
           ->  Hash
                 ->  Seq Scan on int8_tbl b
     ->  Index Scan using tenk1_unique2 on tenk1 c
!          Index Cond: (unique2 = COALESCE((COALESCE(a.q1, 0::bigint)), (COALESCE(b.q2, (-1)::bigint))))
  (8 rows)

  SELECT qq, unique1
--- 2554,2560 ----
           ->  Hash
                 ->  Seq Scan on int8_tbl b
     ->  Index Scan using tenk1_unique2 on tenk1 c
!          Index Cond: (COALESCE((COALESCE(a.q1, 0::bigint)), (COALESCE(b.q2, (-1)::bigint))) = unique2)
  (8 rows)

  SELECT qq, unique1
*************** select b.unique1 from
*** 2833,2838 ****
--- 2833,2882 ----
  (5 rows)

  --
+ -- test handling of potential equivalence clauses above outer joins
+ --
+ explain (costs off)
+ select q1, unique2, thousand, hundred
+   from int8_tbl a left join tenk1 b on q1 = unique2
+   where coalesce(thousand,123) = q1 and q1 = coalesce(hundred,123);
+                                       QUERY PLAN
+ --------------------------------------------------------------------------------------
+  Nested Loop Left Join
+    Filter: ((COALESCE(b.thousand, 123) = a.q1) AND (a.q1 = COALESCE(b.hundred, 123)))
+    ->  Seq Scan on int8_tbl a
+    ->  Index Scan using tenk1_unique2 on tenk1 b
+          Index Cond: (a.q1 = unique2)
+ (5 rows)
+
+ select q1, unique2, thousand, hundred
+   from int8_tbl a left join tenk1 b on q1 = unique2
+   where coalesce(thousand,123) = q1 and q1 = coalesce(hundred,123);
+  q1 | unique2 | thousand | hundred
+ ----+---------+----------+---------
+ (0 rows)
+
+ explain (costs off)
+ select f1, unique2, case when unique2 is null then f1 else 0 end
+   from int4_tbl a left join tenk1 b on f1 = unique2
+   where (case when unique2 is null then f1 else 0 end) = 0;
+                              QUERY PLAN
+ --------------------------------------------------------------------
+  Nested Loop Left Join
+    Filter: (CASE WHEN (b.unique2 IS NULL) THEN a.f1 ELSE 0 END = 0)
+    ->  Seq Scan on int4_tbl a
+    ->  Index Only Scan using tenk1_unique2 on tenk1 b
+          Index Cond: (unique2 = a.f1)
+ (5 rows)
+
+ select f1, unique2, case when unique2 is null then f1 else 0 end
+   from int4_tbl a left join tenk1 b on f1 = unique2
+   where (case when unique2 is null then f1 else 0 end) = 0;
+  f1 | unique2 | case
+ ----+---------+------
+   0 |       0 |    0
+ (1 row)
+
+ --
  -- test join removal
  --
  begin;
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index 8676e2f76103a4ade2c130e59956e41245903dca..536c4543ec9cd6fc2e3f175e9cd5fa1ecf8d7ff5 100644
*** a/src/test/regress/sql/join.sql
--- b/src/test/regress/sql/join.sql
*************** select b.unique1 from
*** 750,755 ****
--- 750,777 ----
    order by 1;

  --
+ -- test handling of potential equivalence clauses above outer joins
+ --
+
+ explain (costs off)
+ select q1, unique2, thousand, hundred
+   from int8_tbl a left join tenk1 b on q1 = unique2
+   where coalesce(thousand,123) = q1 and q1 = coalesce(hundred,123);
+
+ select q1, unique2, thousand, hundred
+   from int8_tbl a left join tenk1 b on q1 = unique2
+   where coalesce(thousand,123) = q1 and q1 = coalesce(hundred,123);
+
+ explain (costs off)
+ select f1, unique2, case when unique2 is null then f1 else 0 end
+   from int4_tbl a left join tenk1 b on f1 = unique2
+   where (case when unique2 is null then f1 else 0 end) = 0;
+
+ select f1, unique2, case when unique2 is null then f1 else 0 end
+   from int4_tbl a left join tenk1 b on f1 = unique2
+   where (case when unique2 is null then f1 else 0 end) = 0;
+
+ --
  -- test join removal
  --

diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 02a0f62a53a4e3d06a3ad48d523e959d5d6b2ab7..35c6287bc84e3786276ece8ce7a8425d8a690a78 100644
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
*************** _outEquivalenceMember(StringInfo str, co
*** 1815,1820 ****
--- 1815,1821 ----

      WRITE_NODE_FIELD(em_expr);
      WRITE_BITMAPSET_FIELD(em_relids);
+     WRITE_BITMAPSET_FIELD(em_nullable_relids);
      WRITE_BOOL_FIELD(em_is_const);
      WRITE_BOOL_FIELD(em_is_child);
      WRITE_OID_FIELD(em_datatype);
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 42286a17e817754e54abe21035af590056381751..9bc830bc35c170560cca17ee81165b55a1a92185 100644
*** a/src/backend/optimizer/path/equivclass.c
--- b/src/backend/optimizer/path/equivclass.c
***************
*** 30,36 ****


  static EquivalenceMember *add_eq_member(EquivalenceClass *ec,
!               Expr *expr, Relids relids,
                bool is_child, Oid datatype);
  static void generate_base_implied_equalities_const(PlannerInfo *root,
                                         EquivalenceClass *ec);
--- 30,36 ----


  static EquivalenceMember *add_eq_member(EquivalenceClass *ec,
!               Expr *expr, Relids relids, Relids nullable_relids,
                bool is_child, Oid datatype);
  static void generate_base_implied_equalities_const(PlannerInfo *root,
                                         EquivalenceClass *ec);
*************** process_equivalence(PlannerInfo *root, R
*** 106,112 ****
      Expr       *item1;
      Expr       *item2;
      Relids        item1_relids,
!                 item2_relids;
      List       *opfamilies;
      EquivalenceClass *ec1,
                 *ec2;
--- 106,114 ----
      Expr       *item1;
      Expr       *item2;
      Relids        item1_relids,
!                 item2_relids,
!                 item1_nullable_relids,
!                 item2_nullable_relids;
      List       *opfamilies;
      EquivalenceClass *ec1,
                 *ec2;
*************** process_equivalence(PlannerInfo *root, R
*** 163,168 ****
--- 165,176 ----
              return false;        /* RHS is non-strict but not constant */
      }

+     /* Calculate nullable-relid sets for each side of the clause */
+     item1_nullable_relids = bms_intersect(item1_relids,
+                                           restrictinfo->nullable_relids);
+     item2_nullable_relids = bms_intersect(item2_relids,
+                                           restrictinfo->nullable_relids);
+
      /*
       * We use the declared input types of the operator, not exprType() of the
       * inputs, as the nominal datatypes for opfamily lookup.  This presumes
*************** process_equivalence(PlannerInfo *root, R
*** 309,315 ****
      else if (ec1)
      {
          /* Case 3: add item2 to ec1 */
!         em2 = add_eq_member(ec1, item2, item2_relids, false, item2_type);
          ec1->ec_sources = lappend(ec1->ec_sources, restrictinfo);
          ec1->ec_below_outer_join |= below_outer_join;
          /* mark the RI as associated with this eclass */
--- 317,324 ----
      else if (ec1)
      {
          /* Case 3: add item2 to ec1 */
!         em2 = add_eq_member(ec1, item2, item2_relids, item2_nullable_relids,
!                             false, item2_type);
          ec1->ec_sources = lappend(ec1->ec_sources, restrictinfo);
          ec1->ec_below_outer_join |= below_outer_join;
          /* mark the RI as associated with this eclass */
*************** process_equivalence(PlannerInfo *root, R
*** 322,328 ****
      else if (ec2)
      {
          /* Case 3: add item1 to ec2 */
!         em1 = add_eq_member(ec2, item1, item1_relids, false, item1_type);
          ec2->ec_sources = lappend(ec2->ec_sources, restrictinfo);
          ec2->ec_below_outer_join |= below_outer_join;
          /* mark the RI as associated with this eclass */
--- 331,338 ----
      else if (ec2)
      {
          /* Case 3: add item1 to ec2 */
!         em1 = add_eq_member(ec2, item1, item1_relids, item1_nullable_relids,
!                             false, item1_type);
          ec2->ec_sources = lappend(ec2->ec_sources, restrictinfo);
          ec2->ec_below_outer_join |= below_outer_join;
          /* mark the RI as associated with this eclass */
*************** process_equivalence(PlannerInfo *root, R
*** 349,356 ****
          ec->ec_broken = false;
          ec->ec_sortref = 0;
          ec->ec_merged = NULL;
!         em1 = add_eq_member(ec, item1, item1_relids, false, item1_type);
!         em2 = add_eq_member(ec, item2, item2_relids, false, item2_type);

          root->eq_classes = lappend(root->eq_classes, ec);

--- 359,368 ----
          ec->ec_broken = false;
          ec->ec_sortref = 0;
          ec->ec_merged = NULL;
!         em1 = add_eq_member(ec, item1, item1_relids, item1_nullable_relids,
!                             false, item1_type);
!         em2 = add_eq_member(ec, item2, item2_relids, item2_nullable_relids,
!                             false, item2_type);

          root->eq_classes = lappend(root->eq_classes, ec);

*************** canonicalize_ec_expression(Expr *expr, O
*** 448,459 ****
   */
  static EquivalenceMember *
  add_eq_member(EquivalenceClass *ec, Expr *expr, Relids relids,
!               bool is_child, Oid datatype)
  {
      EquivalenceMember *em = makeNode(EquivalenceMember);

      em->em_expr = expr;
      em->em_relids = relids;
      em->em_is_const = false;
      em->em_is_child = is_child;
      em->em_datatype = datatype;
--- 460,472 ----
   */
  static EquivalenceMember *
  add_eq_member(EquivalenceClass *ec, Expr *expr, Relids relids,
!               Relids nullable_relids, bool is_child, Oid datatype)
  {
      EquivalenceMember *em = makeNode(EquivalenceMember);

      em->em_expr = expr;
      em->em_relids = relids;
+     em->em_nullable_relids = nullable_relids;
      em->em_is_const = false;
      em->em_is_child = is_child;
      em->em_datatype = datatype;
*************** get_eclass_for_sort_expr(PlannerInfo *ro
*** 609,615 ****
          elog(ERROR, "volatile EquivalenceClass has no sortref");

      newem = add_eq_member(newec, copyObject(expr), pull_varnos((Node *) expr),
!                           false, opcintype);

      /*
       * add_eq_member doesn't check for volatile functions, set-returning
--- 622,628 ----
          elog(ERROR, "volatile EquivalenceClass has no sortref");

      newem = add_eq_member(newec, copyObject(expr), pull_varnos((Node *) expr),
!                           NULL, false, opcintype);

      /*
       * add_eq_member doesn't check for volatile functions, set-returning
*************** generate_base_implied_equalities_const(P
*** 789,795 ****
          }
          process_implied_equality(root, eq_op, ec->ec_collation,
                                   cur_em->em_expr, const_em->em_expr,
!                                  ec->ec_relids,
                                   ec->ec_below_outer_join,
                                   cur_em->em_is_const);
      }
--- 802,810 ----
          }
          process_implied_equality(root, eq_op, ec->ec_collation,
                                   cur_em->em_expr, const_em->em_expr,
!                                  bms_copy(ec->ec_relids),
!                                  bms_union(cur_em->em_nullable_relids,
!                                            const_em->em_nullable_relids),
                                   ec->ec_below_outer_join,
                                   cur_em->em_is_const);
      }
*************** generate_base_implied_equalities_no_cons
*** 844,850 ****
              }
              process_implied_equality(root, eq_op, ec->ec_collation,
                                       prev_em->em_expr, cur_em->em_expr,
!                                      ec->ec_relids,
                                       ec->ec_below_outer_join,
                                       false);
          }
--- 859,867 ----
              }
              process_implied_equality(root, eq_op, ec->ec_collation,
                                       prev_em->em_expr, cur_em->em_expr,
!                                      bms_copy(ec->ec_relids),
!                                      bms_union(prev_em->em_nullable_relids,
!                                                cur_em->em_nullable_relids),
                                       ec->ec_below_outer_join,
                                       false);
          }
*************** create_join_clause(PlannerInfo *root,
*** 1312,1318 ****
                                          leftem->em_expr,
                                          rightem->em_expr,
                                          bms_union(leftem->em_relids,
!                                                   rightem->em_relids));

      /* Mark the clause as redundant, or not */
      rinfo->parent_ec = parent_ec;
--- 1329,1337 ----
                                          leftem->em_expr,
                                          rightem->em_expr,
                                          bms_union(leftem->em_relids,
!                                                   rightem->em_relids),
!                                         bms_union(leftem->em_nullable_relids,
!                                                rightem->em_nullable_relids));

      /* Mark the clause as redundant, or not */
      rinfo->parent_ec = parent_ec;
*************** reconsider_outer_join_clause(PlannerInfo
*** 1534,1540 ****
                  left_type,
                  right_type,
                  inner_datatype;
!     Relids        inner_relids;
      ListCell   *lc1;

      Assert(is_opclause(rinfo->clause));
--- 1553,1560 ----
                  left_type,
                  right_type,
                  inner_datatype;
!     Relids        inner_relids,
!                 inner_nullable_relids;
      ListCell   *lc1;

      Assert(is_opclause(rinfo->clause));
*************** reconsider_outer_join_clause(PlannerInfo
*** 1561,1566 ****
--- 1581,1588 ----
          inner_datatype = left_type;
          inner_relids = rinfo->left_relids;
      }
+     inner_nullable_relids = bms_intersect(inner_relids,
+                                           rinfo->nullable_relids);

      /* Scan EquivalenceClasses for a match to outervar */
      foreach(lc1, root->eq_classes)
*************** reconsider_outer_join_clause(PlannerInfo
*** 1619,1625 ****
                                                     cur_ec->ec_collation,
                                                     innervar,
                                                     cur_em->em_expr,
!                                                    inner_relids);
              if (process_equivalence(root, newrinfo, true))
                  match = true;
          }
--- 1641,1648 ----
                                                     cur_ec->ec_collation,
                                                     innervar,
                                                     cur_em->em_expr,
!                                                    bms_copy(inner_relids),
!                                             bms_copy(inner_nullable_relids));
              if (process_equivalence(root, newrinfo, true))
                  match = true;
          }
*************** reconsider_full_join_clause(PlannerInfo
*** 1653,1659 ****
                  left_type,
                  right_type;
      Relids        left_relids,
!                 right_relids;
      ListCell   *lc1;

      /* Can't use an outerjoin_delayed clause here */
--- 1676,1684 ----
                  left_type,
                  right_type;
      Relids        left_relids,
!                 right_relids,
!                 left_nullable_relids,
!                 right_nullable_relids;
      ListCell   *lc1;

      /* Can't use an outerjoin_delayed clause here */
*************** reconsider_full_join_clause(PlannerInfo
*** 1669,1674 ****
--- 1694,1703 ----
      rightvar = (Expr *) get_rightop(rinfo->clause);
      left_relids = rinfo->left_relids;
      right_relids = rinfo->right_relids;
+     left_nullable_relids = bms_intersect(left_relids,
+                                          rinfo->nullable_relids);
+     right_nullable_relids = bms_intersect(right_relids,
+                                           rinfo->nullable_relids);

      foreach(lc1, root->eq_classes)
      {
*************** reconsider_full_join_clause(PlannerInfo
*** 1754,1760 ****
                                                         cur_ec->ec_collation,
                                                         leftvar,
                                                         cur_em->em_expr,
!                                                        left_relids);
                  if (process_equivalence(root, newrinfo, true))
                      matchleft = true;
              }
--- 1783,1790 ----
                                                         cur_ec->ec_collation,
                                                         leftvar,
                                                         cur_em->em_expr,
!                                                        bms_copy(left_relids),
!                                              bms_copy(left_nullable_relids));
                  if (process_equivalence(root, newrinfo, true))
                      matchleft = true;
              }
*************** reconsider_full_join_clause(PlannerInfo
*** 1767,1773 ****
                                                         cur_ec->ec_collation,
                                                         rightvar,
                                                         cur_em->em_expr,
!                                                        right_relids);
                  if (process_equivalence(root, newrinfo, true))
                      matchright = true;
              }
--- 1797,1804 ----
                                                         cur_ec->ec_collation,
                                                         rightvar,
                                                         cur_em->em_expr,
!                                                        bms_copy(right_relids),
!                                             bms_copy(right_nullable_relids));
                  if (process_equivalence(root, newrinfo, true))
                      matchright = true;
              }
*************** add_child_rel_equivalences(PlannerInfo *
*** 1894,1899 ****
--- 1925,1931 ----
                  /* Yes, generate transformed child version */
                  Expr       *child_expr;
                  Relids        new_relids;
+                 Relids        new_nullable_relids;

                  child_expr = (Expr *)
                      adjust_appendrel_attrs(root,
*************** add_child_rel_equivalences(PlannerInfo *
*** 1910,1916 ****
                                              parent_rel->relids);
                  new_relids = bms_add_members(new_relids, child_rel->relids);

!                 (void) add_eq_member(cur_ec, child_expr, new_relids,
                                       true, cur_em->em_datatype);
              }
          }
--- 1942,1962 ----
                                              parent_rel->relids);
                  new_relids = bms_add_members(new_relids, child_rel->relids);

!                 /*
!                  * And likewise for nullable_relids.  Note this code assumes
!                  * parent and child relids are singletons.
!                  */
!                 new_nullable_relids = cur_em->em_nullable_relids;
!                 if (bms_overlap(new_nullable_relids, parent_rel->relids))
!                 {
!                     new_nullable_relids = bms_difference(new_nullable_relids,
!                                                          parent_rel->relids);
!                     new_nullable_relids = bms_add_members(new_nullable_relids,
!                                                           child_rel->relids);
!                 }
!
!                 (void) add_eq_member(cur_ec, child_expr,
!                                      new_relids, new_nullable_relids,
                                       true, cur_em->em_datatype);
              }
          }
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 9565e2d607022feb2126d0897ef1d7000fcba5f6..bd719b57a69f2dc2698990f81574e4ad07532893 100644
*** a/src/backend/optimizer/plan/initsplan.c
--- b/src/backend/optimizer/plan/initsplan.c
*************** static void distribute_qual_to_rels(Plan
*** 51,59 ****
                          JoinType jointype,
                          Relids qualscope,
                          Relids ojscope,
!                         Relids outerjoin_nonnullable);
  static bool check_outerjoin_delay(PlannerInfo *root, Relids *relids_p,
                        Relids *nullable_relids_p, bool is_pushed_down);
  static bool check_redundant_nullability_qual(PlannerInfo *root, Node *clause);
  static void check_mergejoinable(RestrictInfo *restrictinfo);
  static void check_hashjoinable(RestrictInfo *restrictinfo);
--- 51,62 ----
                          JoinType jointype,
                          Relids qualscope,
                          Relids ojscope,
!                         Relids outerjoin_nonnullable,
!                         Relids deduced_nullable_relids);
  static bool check_outerjoin_delay(PlannerInfo *root, Relids *relids_p,
                        Relids *nullable_relids_p, bool is_pushed_down);
+ static bool check_equivalence_delay(PlannerInfo *root,
+                         RestrictInfo *restrictinfo);
  static bool check_redundant_nullability_qual(PlannerInfo *root, Node *clause);
  static void check_mergejoinable(RestrictInfo *restrictinfo);
  static void check_hashjoinable(RestrictInfo *restrictinfo);
*************** deconstruct_recurse(PlannerInfo *root, N
*** 641,647 ****

              distribute_qual_to_rels(root, qual,
                                      false, below_outer_join, JOIN_INNER,
!                                     *qualscope, NULL, NULL);
          }
      }
      else if (IsA(jtnode, JoinExpr))
--- 644,650 ----

              distribute_qual_to_rels(root, qual,
                                      false, below_outer_join, JOIN_INNER,
!                                     *qualscope, NULL, NULL, NULL);
          }
      }
      else if (IsA(jtnode, JoinExpr))
*************** deconstruct_recurse(PlannerInfo *root, N
*** 765,771 ****
              distribute_qual_to_rels(root, qual,
                                      false, below_outer_join, j->jointype,
                                      *qualscope,
!                                     ojscope, nonnullable_rels);
          }

          /* Now we can add the SpecialJoinInfo to join_info_list */
--- 768,774 ----
              distribute_qual_to_rels(root, qual,
                                      false, below_outer_join, j->jointype,
                                      *qualscope,
!                                     ojscope, nonnullable_rels, NULL);
          }

          /* Now we can add the SpecialJoinInfo to join_info_list */
*************** make_outerjoininfo(PlannerInfo *root,
*** 1074,1086 ****
   *        baserels appearing on the outer (nonnullable) side of the join
   *        (for FULL JOIN this includes both sides of the join, and must in fact
   *        equal qualscope)
   *
   * 'qualscope' identifies what level of JOIN the qual came from syntactically.
   * 'ojscope' is needed if we decide to force the qual up to the outer-join
   * level, which will be ojscope not necessarily qualscope.
   *
!  * At the time this is called, root->join_info_list must contain entries for
!  * all and only those special joins that are syntactically below this qual.
   */
  static void
  distribute_qual_to_rels(PlannerInfo *root, Node *clause,
--- 1077,1095 ----
   *        baserels appearing on the outer (nonnullable) side of the join
   *        (for FULL JOIN this includes both sides of the join, and must in fact
   *        equal qualscope)
+  * 'deduced_nullable_relids': if is_deduced is TRUE, the nullable relids to
+  *        impute to the clause; otherwise NULL
   *
   * 'qualscope' identifies what level of JOIN the qual came from syntactically.
   * 'ojscope' is needed if we decide to force the qual up to the outer-join
   * level, which will be ojscope not necessarily qualscope.
   *
!  * In normal use (when is_deduced is FALSE), at the time this is called,
!  * root->join_info_list must contain entries for all and only those special
!  * joins that are syntactically below this qual.  But when is_deduced is TRUE,
!  * we are adding new deduced clauses after completion of deconstruct_jointree,
!  * so it cannot be assumed that root->join_info_list has anything to do with
!  * qual placement.
   */
  static void
  distribute_qual_to_rels(PlannerInfo *root, Node *clause,
*************** distribute_qual_to_rels(PlannerInfo *roo
*** 1089,1095 ****
                          JoinType jointype,
                          Relids qualscope,
                          Relids ojscope,
!                         Relids outerjoin_nonnullable)
  {
      Relids        relids;
      bool        is_pushed_down;
--- 1098,1105 ----
                          JoinType jointype,
                          Relids qualscope,
                          Relids ojscope,
!                         Relids outerjoin_nonnullable,
!                         Relids deduced_nullable_relids)
  {
      Relids        relids;
      bool        is_pushed_down;
*************** distribute_qual_to_rels(PlannerInfo *roo
*** 1211,1222 ****
           * If the qual came from implied-equality deduction, it should not be
           * outerjoin-delayed, else deducer blew it.  But we can't check this
           * because the join_info_list may now contain OJs above where the qual
!          * belongs.
           */
          Assert(!ojscope);
          is_pushed_down = true;
          outerjoin_delayed = false;
!         nullable_relids = NULL;
          /* Don't feed it back for more deductions */
          maybe_equivalence = false;
          maybe_outer_join = false;
--- 1221,1233 ----
           * If the qual came from implied-equality deduction, it should not be
           * outerjoin-delayed, else deducer blew it.  But we can't check this
           * because the join_info_list may now contain OJs above where the qual
!          * belongs.  For the same reason, we must rely on caller to supply the
!          * correct nullable_relids set.
           */
          Assert(!ojscope);
          is_pushed_down = true;
          outerjoin_delayed = false;
!         nullable_relids = deduced_nullable_relids;
          /* Don't feed it back for more deductions */
          maybe_equivalence = false;
          maybe_outer_join = false;
*************** distribute_qual_to_rels(PlannerInfo *roo
*** 1388,1394 ****
      {
          if (maybe_equivalence)
          {
!             if (process_equivalence(root, restrictinfo, below_outer_join))
                  return;
              /* EC rejected it, so set left_ec/right_ec the hard way ... */
              initialize_mergeclause_eclasses(root, restrictinfo);
--- 1399,1406 ----
      {
          if (maybe_equivalence)
          {
!             if (check_equivalence_delay(root, restrictinfo) &&
!                 process_equivalence(root, restrictinfo, below_outer_join))
                  return;
              /* EC rejected it, so set left_ec/right_ec the hard way ... */
              initialize_mergeclause_eclasses(root, restrictinfo);
*************** check_outerjoin_delay(PlannerInfo *root,
*** 1561,1566 ****
--- 1573,1616 ----
  }

  /*
+  * check_equivalence_delay
+  *        Detect whether a potential equivalence clause is rendered unsafe
+  *        by outer-join-delay considerations.  Return TRUE if it's safe.
+  *
+  * The initial tests in distribute_qual_to_rels will consider a mergejoinable
+  * clause to be a potential equivalence clause if it is not outerjoin_delayed.
+  * But since the point of equivalence processing is that we will recombine the
+  * two sides of the clause with others, we have to check that each side
+  * satisfies the not-outerjoin_delayed condition on its own; otherwise it might
+  * not be safe to evaluate everywhere we could place a derived equivalence
+  * condition.
+  */
+ static bool
+ check_equivalence_delay(PlannerInfo *root,
+                         RestrictInfo *restrictinfo)
+ {
+     Relids        relids;
+     Relids        nullable_relids;
+
+     /* fast path if no special joins */
+     if (root->join_info_list == NIL)
+         return true;
+
+     /* must copy restrictinfo's relids to avoid changing it */
+     relids = bms_copy(restrictinfo->left_relids);
+     /* check left side does not need delay */
+     if (check_outerjoin_delay(root, &relids, &nullable_relids, true))
+         return false;
+
+     /* and similarly for the right side */
+     relids = bms_copy(restrictinfo->right_relids);
+     if (check_outerjoin_delay(root, &relids, &nullable_relids, true))
+         return false;
+
+     return true;
+ }
+
+ /*
   * check_redundant_nullability_qual
   *      Check to see if the qual is an IS NULL qual that is redundant with
   *      a lower JOIN_ANTI join.
*************** distribute_restrictinfo_to_rels(PlannerI
*** 1670,1680 ****
--- 1720,1739 ----
   * variable-free.  Otherwise the qual is applied at the lowest join level
   * that provides all its variables.
   *
+  * "nullable_relids" is the set of relids used in the expressions that are
+  * potentially nullable below the expressions.  (This has to be supplied by
+  * caller because this function is used after deconstruct_jointree, so we
+  * don't have knowledge of where the clause items came from.)
+  *
   * "both_const" indicates whether both items are known pseudo-constant;
   * in this case it is worth applying eval_const_expressions() in case we
   * can produce constant TRUE or constant FALSE.  (Otherwise it's not,
   * because the expressions went through eval_const_expressions already.)
   *
+  * Note: this function will copy item1 and item2, but it is caller's
+  * responsibility to make sure that the Relids parameters are fresh copies
+  * not shared with other uses.
+  *
   * This is currently used only when an EquivalenceClass is found to
   * contain pseudoconstants.  See path/pathkeys.c for more details.
   */
*************** process_implied_equality(PlannerInfo *ro
*** 1685,1690 ****
--- 1744,1750 ----
                           Expr *item1,
                           Expr *item2,
                           Relids qualscope,
+                          Relids nullable_relids,
                           bool below_outer_join,
                           bool both_const)
  {
*************** process_implied_equality(PlannerInfo *ro
*** 1718,1732 ****
          }
      }

-     /* Make a copy of qualscope to avoid problems if source EC changes */
-     qualscope = bms_copy(qualscope);
-
      /*
       * Push the new clause into all the appropriate restrictinfo lists.
       */
      distribute_qual_to_rels(root, (Node *) clause,
                              true, below_outer_join, JOIN_INNER,
!                             qualscope, NULL, NULL);
  }

  /*
--- 1778,1789 ----
          }
      }

      /*
       * Push the new clause into all the appropriate restrictinfo lists.
       */
      distribute_qual_to_rels(root, (Node *) clause,
                              true, below_outer_join, JOIN_INNER,
!                             qualscope, NULL, NULL, nullable_relids);
  }

  /*
*************** process_implied_equality(PlannerInfo *ro
*** 1735,1740 ****
--- 1792,1801 ----
   * This overlaps the functionality of process_implied_equality(), but we
   * must return the RestrictInfo, not push it into the joininfo tree.
   *
+  * Note: this function will copy item1 and item2, but it is caller's
+  * responsibility to make sure that the Relids parameters are fresh copies
+  * not shared with other uses.
+  *
   * Note: we do not do initialize_mergeclause_eclasses() here.  It is
   * caller's responsibility that left_ec/right_ec be set as necessary.
   */
*************** build_implied_join_equality(Oid opno,
*** 1743,1749 ****
                              Oid collation,
                              Expr *item1,
                              Expr *item2,
!                             Relids qualscope)
  {
      RestrictInfo *restrictinfo;
      Expr       *clause;
--- 1804,1811 ----
                              Oid collation,
                              Expr *item1,
                              Expr *item2,
!                             Relids qualscope,
!                             Relids nullable_relids)
  {
      RestrictInfo *restrictinfo;
      Expr       *clause;
*************** build_implied_join_equality(Oid opno,
*** 1760,1768 ****
                             InvalidOid,
                             collation);

-     /* Make a copy of qualscope to avoid problems if source EC changes */
-     qualscope = bms_copy(qualscope);
-
      /*
       * Build the RestrictInfo node itself.
       */
--- 1822,1827 ----
*************** build_implied_join_equality(Oid opno,
*** 1772,1778 ****
                                       false,        /* pseudoconstant */
                                       qualscope, /* required_relids */
                                       NULL,        /* outer_relids */
!                                      NULL);        /* nullable_relids */

      /* Set mergejoinability/hashjoinability flags */
      check_mergejoinable(restrictinfo);
--- 1831,1837 ----
                                       false,        /* pseudoconstant */
                                       qualscope, /* required_relids */
                                       NULL,        /* outer_relids */
!                                      nullable_relids);    /* nullable_relids */

      /* Set mergejoinability/hashjoinability flags */
      check_mergejoinable(restrictinfo);
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 2b2742d7ef5bbc1ae2ddd3e2bf81f986e80524d9..0a1f8d5289ec0576895151a780be7248e2ba6c3b 100644
*** a/src/include/nodes/relation.h
--- b/src/include/nodes/relation.h
*************** typedef struct EquivalenceMember
*** 616,621 ****
--- 616,622 ----

      Expr       *em_expr;        /* the expression represented */
      Relids        em_relids;        /* all relids appearing in em_expr */
+     Relids        em_nullable_relids;        /* nullable by lower outer joins */
      bool        em_is_const;    /* expression is pseudoconstant? */
      bool        em_is_child;    /* derived version for a child relation? */
      Oid            em_datatype;    /* the "nominal type" used by the opfamily */
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index c395d4292c82c69ab3112641417530e8a5dd5a06..0fe696c2dbd63858bfccd7225216814005ace3f2 100644
*** a/src/include/optimizer/planmain.h
--- b/src/include/optimizer/planmain.h
*************** extern void process_implied_equality(Pla
*** 105,117 ****
                           Expr *item1,
                           Expr *item2,
                           Relids qualscope,
                           bool below_outer_join,
                           bool both_const);
  extern RestrictInfo *build_implied_join_equality(Oid opno,
                              Oid collation,
                              Expr *item1,
                              Expr *item2,
!                             Relids qualscope);

  /*
   * prototypes for plan/analyzejoins.c
--- 105,119 ----
                           Expr *item1,
                           Expr *item2,
                           Relids qualscope,
+                          Relids nullable_relids,
                           bool below_outer_join,
                           bool both_const);
  extern RestrictInfo *build_implied_join_equality(Oid opno,
                              Oid collation,
                              Expr *item1,
                              Expr *item2,
!                             Relids qualscope,
!                             Relids nullable_relids);

  /*
   * prototypes for plan/analyzejoins.c
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index d2c41b5e4feac1b24bd8d414252f9a9367061e1d..22265d7a7c88fa3e331588dbc1bbea0803155fbc 100644
*** a/src/test/regress/expected/join.out
--- b/src/test/regress/expected/join.out
*************** select b.unique1 from
*** 2827,2832 ****
--- 2827,2903 ----
  (5 rows)

  --
+ -- test handling of potential equivalence clauses above outer joins
+ --
+ explain (costs off)
+ select q1, unique2, thousand, hundred
+   from int8_tbl a left join tenk1 b on q1 = unique2
+   where coalesce(thousand,123) = q1 and q1 = coalesce(hundred,123);
+                                       QUERY PLAN
+ --------------------------------------------------------------------------------------
+  Nested Loop Left Join
+    Filter: ((COALESCE(b.thousand, 123) = a.q1) AND (a.q1 = COALESCE(b.hundred, 123)))
+    ->  Seq Scan on int8_tbl a
+    ->  Index Scan using tenk1_unique2 on tenk1 b
+          Index Cond: (a.q1 = unique2)
+ (5 rows)
+
+ select q1, unique2, thousand, hundred
+   from int8_tbl a left join tenk1 b on q1 = unique2
+   where coalesce(thousand,123) = q1 and q1 = coalesce(hundred,123);
+  q1 | unique2 | thousand | hundred
+ ----+---------+----------+---------
+ (0 rows)
+
+ explain (costs off)
+ select f1, unique2, case when unique2 is null then f1 else 0 end
+   from int4_tbl a left join tenk1 b on f1 = unique2
+   where (case when unique2 is null then f1 else 0 end) = 0;
+                              QUERY PLAN
+ --------------------------------------------------------------------
+  Nested Loop Left Join
+    Filter: (CASE WHEN (b.unique2 IS NULL) THEN a.f1 ELSE 0 END = 0)
+    ->  Seq Scan on int4_tbl a
+    ->  Index Only Scan using tenk1_unique2 on tenk1 b
+          Index Cond: (unique2 = a.f1)
+ (5 rows)
+
+ select f1, unique2, case when unique2 is null then f1 else 0 end
+   from int4_tbl a left join tenk1 b on f1 = unique2
+   where (case when unique2 is null then f1 else 0 end) = 0;
+  f1 | unique2 | case
+ ----+---------+------
+   0 |       0 |    0
+ (1 row)
+
+ --
+ -- test ability to push constants through outer join clauses
+ --
+ explain (costs off)
+   select * from int4_tbl a left join tenk1 b on f1 = unique2 where f1 = 0;
+                    QUERY PLAN
+ -------------------------------------------------
+  Nested Loop Left Join
+    Join Filter: (a.f1 = b.unique2)
+    ->  Seq Scan on int4_tbl a
+          Filter: (f1 = 0)
+    ->  Index Scan using tenk1_unique2 on tenk1 b
+          Index Cond: (unique2 = 0)
+ (6 rows)
+
+ explain (costs off)
+   select * from tenk1 a full join tenk1 b using(unique2) where unique2 = 42;
+                    QUERY PLAN
+ -------------------------------------------------
+  Merge Full Join
+    Merge Cond: (a.unique2 = b.unique2)
+    ->  Index Scan using tenk1_unique2 on tenk1 a
+          Index Cond: (unique2 = 42)
+    ->  Index Scan using tenk1_unique2 on tenk1 b
+          Index Cond: (unique2 = 42)
+ (6 rows)
+
+ --
  -- test join removal
  --
  begin;
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index b0cf51cbc8c2277070f9339e013e1d1f2d41d30e..6c1e3394adca36c837bff3d20f62602fafd70f5b 100644
*** a/src/test/regress/sql/join.sql
--- b/src/test/regress/sql/join.sql
*************** select b.unique1 from
*** 750,755 ****
--- 750,787 ----
    order by 1;

  --
+ -- test handling of potential equivalence clauses above outer joins
+ --
+
+ explain (costs off)
+ select q1, unique2, thousand, hundred
+   from int8_tbl a left join tenk1 b on q1 = unique2
+   where coalesce(thousand,123) = q1 and q1 = coalesce(hundred,123);
+
+ select q1, unique2, thousand, hundred
+   from int8_tbl a left join tenk1 b on q1 = unique2
+   where coalesce(thousand,123) = q1 and q1 = coalesce(hundred,123);
+
+ explain (costs off)
+ select f1, unique2, case when unique2 is null then f1 else 0 end
+   from int4_tbl a left join tenk1 b on f1 = unique2
+   where (case when unique2 is null then f1 else 0 end) = 0;
+
+ select f1, unique2, case when unique2 is null then f1 else 0 end
+   from int4_tbl a left join tenk1 b on f1 = unique2
+   where (case when unique2 is null then f1 else 0 end) = 0;
+
+ --
+ -- test ability to push constants through outer join clauses
+ --
+
+ explain (costs off)
+   select * from int4_tbl a left join tenk1 b on f1 = unique2 where f1 = 0;
+
+ explain (costs off)
+   select * from tenk1 a full join tenk1 b using(unique2) where unique2 = 42;
+
+ --
  -- test join removal
  --


pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: Deprecating RULES
Next
From: Daniel Farina
Date:
Subject: Re: Deprecating RULES