Thread: Anybody using get_eclass_for_sort_expr in an extension?

Anybody using get_eclass_for_sort_expr in an extension?

From
Tom Lane
Date:
I looked into bug #8591:
http://www.postgresql.org/message-id/E1Vgk41-00050x-Ck@wrigleys.postgresql.org
and was able to reproduce the problem.  The proximate cause is that
get_eclass_for_sort_expr is wrong to suppose that it can always create new
equivalence class entries with empty em_nullable_relids; in the case
where the call is coming from initialize_mergeclause_eclasses, it's
essential to use the info supplied in restrictinfo->nullable_relids.
Otherwise, quals deduced from the equivalence class may have wrong
nullable_relids settings, allowing them to be pushed to unsafe places
in 9.2 and later.

This is relatively easy to fix as in the attached patch, but I'm wondering
how risky this is to back-patch.  We can deal with the field added to
PlannerInfo by sticking it at the end of the struct in the back branches;
we've done that before and not heard squawks.  However, the signature
change for get_eclass_for_sort_expr() would be problematic if any
third-party code is calling that function directly.  I'm inclined to think
there probably isn't any such code, since none of the existing calls are
in index-specific code or other places that people would have been likely
to copy and modify.  Still, I can't claim that the risk is zero.

We could hack around that in the back branches in more or less ugly ways,
such as by making get_eclass_for_sort_expr() be a wrapper around a new
function.  However, it's far from clear what such a wrapper ought to do
--- it could pass NULL for nullable_relids, which would match the existing
behavior, but the odds seem good that the existing behavior might be wrong
for whatever an extension is trying to do.  And having the code text
diverge between HEAD and back branches isn't so appetizing anyway.

I'm a bit inclined to take the risk of breaking anything that's calling
get_eclass_for_sort_expr() directly.  Thoughts?

            regards, tom lane

diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 817b149..b39927e 100644
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
*************** _outPlannerInfo(StringInfo str, const Pl
*** 1698,1703 ****
--- 1698,1704 ----
      WRITE_UINT_FIELD(query_level);
      WRITE_NODE_FIELD(plan_params);
      WRITE_BITMAPSET_FIELD(all_baserels);
+     WRITE_BITMAPSET_FIELD(nullable_baserels);
      WRITE_NODE_FIELD(join_rel_list);
      WRITE_INT_FIELD(join_cur_level);
      WRITE_NODE_FIELD(init_plans);
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 711b161..baddd34 100644
*** a/src/backend/optimizer/path/equivclass.c
--- b/src/backend/optimizer/path/equivclass.c
*************** add_eq_member(EquivalenceClass *ec, Expr
*** 510,515 ****
--- 510,522 ----
   *      equivalence class it is a member of; if none, optionally build a new
   *      single-member EquivalenceClass for it.
   *
+  * expr is the expression, and nullable_relids is the set of base relids
+  * that are potentially nullable below it.    We actually only care about
+  * the set of such relids that are used in the expression; but for caller
+  * convenience, we perform that intersection step here.  The caller need
+  * only be sure that nullable_relids doesn't omit any nullable rels that
+  * might appear in the expr.
+  *
   * sortref is the SortGroupRef of the originating SortGroupClause, if any,
   * or zero if not.    (It should never be zero if the expression is volatile!)
   *
*************** add_eq_member(EquivalenceClass *ec, Expr
*** 538,543 ****
--- 545,551 ----
  EquivalenceClass *
  get_eclass_for_sort_expr(PlannerInfo *root,
                           Expr *expr,
+                          Relids nullable_relids,
                           List *opfamilies,
                           Oid opcintype,
                           Oid collation,
*************** get_eclass_for_sort_expr(PlannerInfo *ro
*** 545,550 ****
--- 553,559 ----
                           Relids rel,
                           bool create_it)
  {
+     Relids        expr_relids;
      EquivalenceClass *newec;
      EquivalenceMember *newem;
      ListCell   *lc1;
*************** get_eclass_for_sort_expr(PlannerInfo *ro
*** 556,561 ****
--- 565,576 ----
      expr = canonicalize_ec_expression(expr, opcintype, collation);

      /*
+      * Get the precise set of nullable relids appearing in the expression.
+      */
+     expr_relids = pull_varnos((Node *) expr);
+     nullable_relids = bms_intersect(nullable_relids, expr_relids);
+
+     /*
       * Scan through the existing EquivalenceClasses for a match
       */
      foreach(lc1, root->eq_classes)
*************** get_eclass_for_sort_expr(PlannerInfo *ro
*** 629,636 ****
      if (newec->ec_has_volatile && sortref == 0) /* should not happen */
          elog(ERROR, "volatile EquivalenceClass has no sortref");

!     newem = add_eq_member(newec, copyObject(expr), pull_varnos((Node *) expr),
!                           NULL, false, opcintype);

      /*
       * add_eq_member doesn't check for volatile functions, set-returning
--- 644,651 ----
      if (newec->ec_has_volatile && sortref == 0) /* should not happen */
          elog(ERROR, "volatile EquivalenceClass has no sortref");

!     newem = add_eq_member(newec, copyObject(expr), expr_relids,
!                           nullable_relids, false, opcintype);

      /*
       * add_eq_member doesn't check for volatile functions, set-returning
diff --git a/src/backend/optimizer/path/pathkeys.c b/src/backend/optimizer/path/pathkeys.c
index 6724996..f263054 100644
*** a/src/backend/optimizer/path/pathkeys.c
--- b/src/backend/optimizer/path/pathkeys.c
*************** pathkey_is_redundant(PathKey *new_pathke
*** 154,159 ****
--- 154,162 ----
   *      Given an expression and sort-order information, create a PathKey.
   *      The result is always a "canonical" PathKey, but it might be redundant.
   *
+  * expr is the expression, and nullable_relids is the set of base relids
+  * that are potentially nullable below it.
+  *
   * If the PathKey is being generated from a SortGroupClause, sortref should be
   * the SortGroupClause's SortGroupRef; otherwise zero.
   *
*************** pathkey_is_redundant(PathKey *new_pathke
*** 169,174 ****
--- 172,178 ----
  static PathKey *
  make_pathkey_from_sortinfo(PlannerInfo *root,
                             Expr *expr,
+                            Relids nullable_relids,
                             Oid opfamily,
                             Oid opcintype,
                             Oid collation,
*************** make_pathkey_from_sortinfo(PlannerInfo *
*** 204,211 ****
               equality_op);

      /* Now find or (optionally) create a matching EquivalenceClass */
!     eclass = get_eclass_for_sort_expr(root, expr, opfamilies,
!                                       opcintype, collation,
                                        sortref, rel, create_it);

      /* Fail if no EC and !create_it */
--- 208,215 ----
               equality_op);

      /* Now find or (optionally) create a matching EquivalenceClass */
!     eclass = get_eclass_for_sort_expr(root, expr, nullable_relids,
!                                       opfamilies, opcintype, collation,
                                        sortref, rel, create_it);

      /* Fail if no EC and !create_it */
*************** make_pathkey_from_sortinfo(PlannerInfo *
*** 227,232 ****
--- 231,237 ----
  static PathKey *
  make_pathkey_from_sortop(PlannerInfo *root,
                           Expr *expr,
+                          Relids nullable_relids,
                           Oid ordering_op,
                           bool nulls_first,
                           Index sortref,
*************** make_pathkey_from_sortop(PlannerInfo *ro
*** 248,253 ****
--- 253,259 ----

      return make_pathkey_from_sortinfo(root,
                                        expr,
+                                       nullable_relids,
                                        opfamily,
                                        opcintype,
                                        collation,
*************** build_index_pathkeys(PlannerInfo *root,
*** 461,469 ****
              nulls_first = index->nulls_first[i];
          }

!         /* OK, try to make a canonical pathkey for this sort key */
          cpathkey = make_pathkey_from_sortinfo(root,
                                                indexkey,
                                                index->sortopfamily[i],
                                                index->opcintype[i],
                                                index->indexcollations[i],
--- 467,479 ----
              nulls_first = index->nulls_first[i];
          }

!         /*
!          * OK, try to make a canonical pathkey for this sort key.  Note we're
!          * underneath any outer joins, so nullable_relids should be NULL.
!          */
          cpathkey = make_pathkey_from_sortinfo(root,
                                                indexkey,
+                                               NULL,
                                                index->sortopfamily[i],
                                                index->opcintype[i],
                                                index->indexcollations[i],
*************** convert_subquery_pathkeys(PlannerInfo *r
*** 551,561 ****
                   * expression is *not* volatile in the outer query: it's just
                   * a Var referencing whatever the subquery emitted. (IOW, the
                   * outer query isn't going to re-execute the volatile
!                  * expression itself.)    So this is okay.
                   */
                  outer_ec =
                      get_eclass_for_sort_expr(root,
                                               outer_expr,
                                               sub_eclass->ec_opfamilies,
                                               sub_member->em_datatype,
                                               sub_eclass->ec_collation,
--- 561,574 ----
                   * expression is *not* volatile in the outer query: it's just
                   * a Var referencing whatever the subquery emitted. (IOW, the
                   * outer query isn't going to re-execute the volatile
!                  * expression itself.)    So this is okay.  Likewise, it's
!                  * correct to pass nullable_relids = NULL, because we're
!                  * underneath any outer joins appearing in the outer query.
                   */
                  outer_ec =
                      get_eclass_for_sort_expr(root,
                                               outer_expr,
+                                              NULL,
                                               sub_eclass->ec_opfamilies,
                                               sub_member->em_datatype,
                                               sub_eclass->ec_collation,
*************** convert_subquery_pathkeys(PlannerInfo *r
*** 643,648 ****
--- 656,662 ----
                      /* See if we have a matching EC for that */
                      outer_ec = get_eclass_for_sort_expr(root,
                                                          outer_expr,
+                                                         NULL,
                                                     sub_eclass->ec_opfamilies,
                                                          sub_expr_type,
                                                          sub_expr_coll,
*************** build_join_pathkeys(PlannerInfo *root,
*** 748,753 ****
--- 762,774 ----
   * The resulting PathKeys are always in canonical form.  (Actually, there
   * is no longer any code anywhere that creates non-canonical PathKeys.)
   *
+  * We assume that root->nullable_baserels is the set of base relids that
+  * could have gone to NULL below the SortGroupClause expressions.  This is
+  * okay as long as the expressions came from the query's top level (ORDER BY,
+  * DISTINCT, etc) and this function is only invoked after deconstruct_jointree.
+  * In the future we might have to make callers pass in the appropriate
+  * nullable-relids set, but for now it seems unnecessary.
+  *
   * 'sortclauses' is a list of SortGroupClause nodes
   * 'tlist' is the targetlist to find the referenced tlist entries in
   */
*************** make_pathkeys_for_sortclauses(PlannerInf
*** 769,774 ****
--- 790,796 ----
          Assert(OidIsValid(sortcl->sortop));
          pathkey = make_pathkey_from_sortop(root,
                                             sortkey,
+                                            root->nullable_baserels,
                                             sortcl->sortop,
                                             sortcl->nulls_first,
                                             sortcl->tleSortGroupRef,
*************** initialize_mergeclause_eclasses(PlannerI
*** 824,829 ****
--- 846,852 ----
      restrictinfo->left_ec =
          get_eclass_for_sort_expr(root,
                                   (Expr *) get_leftop(clause),
+                                  restrictinfo->nullable_relids,
                                   restrictinfo->mergeopfamilies,
                                   lefttype,
                                   ((OpExpr *) clause)->inputcollid,
*************** initialize_mergeclause_eclasses(PlannerI
*** 833,838 ****
--- 856,862 ----
      restrictinfo->right_ec =
          get_eclass_for_sort_expr(root,
                                   (Expr *) get_rightop(clause),
+                                  restrictinfo->nullable_relids,
                                   restrictinfo->mergeopfamilies,
                                   righttype,
                                   ((OpExpr *) clause)->inputcollid,
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index c5998b9..04a399e 100644
*** a/src/backend/optimizer/plan/initsplan.c
--- b/src/backend/optimizer/plan/initsplan.c
*************** deconstruct_jointree(PlannerInfo *root)
*** 649,654 ****
--- 649,657 ----
      Assert(root->parse->jointree != NULL &&
             IsA(root->parse->jointree, FromExpr));

+     /* this is filled as we scan the jointree */
+     root->nullable_baserels = NULL;
+
      result = deconstruct_recurse(root, (Node *) root->parse->jointree, false,
                                   &qualscope, &inner_join_rels,
                                   &postponed_qual_list);
*************** deconstruct_recurse(PlannerInfo *root, N
*** 790,795 ****
--- 793,799 ----
                      left_inners,
                      right_inners,
                      nonnullable_rels,
+                     nullable_rels,
                      ojscope;
          List       *leftjoinlist,
                     *rightjoinlist;
*************** deconstruct_recurse(PlannerInfo *root, N
*** 823,828 ****
--- 827,834 ----
                  *inner_join_rels = *qualscope;
                  /* Inner join adds no restrictions for quals */
                  nonnullable_rels = NULL;
+                 /* and it doesn't force anything to null, either */
+                 nullable_rels = NULL;
                  break;
              case JOIN_LEFT:
              case JOIN_ANTI:
*************** deconstruct_recurse(PlannerInfo *root, N
*** 837,842 ****
--- 843,849 ----
                  *qualscope = bms_union(leftids, rightids);
                  *inner_join_rels = bms_union(left_inners, right_inners);
                  nonnullable_rels = leftids;
+                 nullable_rels = rightids;
                  break;
              case JOIN_SEMI:
                  leftjoinlist = deconstruct_recurse(root, j->larg,
*************** deconstruct_recurse(PlannerInfo *root, N
*** 851,856 ****
--- 858,870 ----
                  *inner_join_rels = bms_union(left_inners, right_inners);
                  /* Semi join adds no restrictions for quals */
                  nonnullable_rels = NULL;
+
+                 /*
+                  * Theoretically, a semijoin would null the RHS; but since the
+                  * RHS can't be accessed above the join, this is immaterial
+                  * and we needn't account for it.
+                  */
+                 nullable_rels = NULL;
                  break;
              case JOIN_FULL:
                  leftjoinlist = deconstruct_recurse(root, j->larg,
*************** deconstruct_recurse(PlannerInfo *root, N
*** 865,880 ****
--- 879,900 ----
                  *inner_join_rels = bms_union(left_inners, right_inners);
                  /* each side is both outer and inner */
                  nonnullable_rels = *qualscope;
+                 nullable_rels = *qualscope;
                  break;
              default:
                  /* JOIN_RIGHT was eliminated during reduce_outer_joins() */
                  elog(ERROR, "unrecognized join type: %d",
                       (int) j->jointype);
                  nonnullable_rels = NULL;        /* keep compiler quiet */
+                 nullable_rels = NULL;
                  leftjoinlist = rightjoinlist = NIL;
                  break;
          }

+         /* Report all rels that will be nulled anywhere in the jointree */
+         root->nullable_baserels = bms_add_members(root->nullable_baserels,
+                                                   nullable_rels);
+
          /*
           * For an OJ, form the SpecialJoinInfo now, because we need the OJ's
           * semantic scope (ojscope) to pass to distribute_qual_to_rels.  But
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index a2853fb..6d7b594 100644
*** a/src/include/nodes/relation.h
--- b/src/include/nodes/relation.h
*************** typedef struct PlannerInfo
*** 150,160 ****
      /*
       * all_baserels is a Relids set of all base relids (but not "other"
       * relids) in the query; that is, the Relids identifier of the final join
!      * we need to form.
       */
      Relids        all_baserels;

      /*
       * join_rel_list is a list of all join-relation RelOptInfos we have
       * considered in this planning run.  For small problems we just scan the
       * list to do lookups, but when there are many join relations we build a
--- 150,169 ----
      /*
       * all_baserels is a Relids set of all base relids (but not "other"
       * relids) in the query; that is, the Relids identifier of the final join
!      * we need to form.  This is computed in make_one_rel, just before we
!      * start making Paths.
       */
      Relids        all_baserels;

      /*
+      * nullable_baserels is a Relids set of base relids that are nullable by
+      * some outer join in the jointree; these are rels that are potentially
+      * nullable below the WHERE clause, SELECT targetlist, etc.  This is
+      * computed in deconstruct_jointree.
+      */
+     Relids        nullable_baserels;
+
+     /*
       * join_rel_list is a list of all join-relation RelOptInfos we have
       * considered in this planning run.  For small problems we just scan the
       * list to do lookups, but when there are many join relations we build a
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 9ef93c7..96ffdb1 100644
*** a/src/include/optimizer/paths.h
--- b/src/include/optimizer/paths.h
*************** extern Expr *canonicalize_ec_expression(
*** 109,114 ****
--- 109,115 ----
  extern void reconsider_outer_join_clauses(PlannerInfo *root);
  extern EquivalenceClass *get_eclass_for_sort_expr(PlannerInfo *root,
                           Expr *expr,
+                          Relids nullable_relids,
                           List *opfamilies,
                           Oid opcintype,
                           Oid collation,
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index b59004d..b9559ea 100644
*** a/src/test/regress/expected/join.out
--- b/src/test/regress/expected/join.out
*************** select f1, unique2, case when unique2 is
*** 2901,2906 ****
--- 2901,2935 ----
  (1 row)

  --
+ -- another case with equivalence clauses above outer joins (bug #8591)
+ --
+ explain (costs off)
+ select a.unique1, b.unique1, c.unique1, coalesce(b.twothousand, a.twothousand)
+   from tenk1 a left join tenk1 b on b.thousand = a.unique1                        left join tenk1 c on c.unique2 =
coalesce(b.twothousand,a.twothousand) 
+   where a.unique2 = 5530 and coalesce(b.twothousand, a.twothousand) = 44;
+                                          QUERY PLAN
+ ---------------------------------------------------------------------------------------------
+  Nested Loop Left Join
+    ->  Nested Loop Left Join
+          Filter: (COALESCE(b.twothousand, a.twothousand) = 44)
+          ->  Index Scan using tenk1_unique2 on tenk1 a
+                Index Cond: (unique2 = 5530)
+          ->  Bitmap Heap Scan on tenk1 b
+                Recheck Cond: (thousand = a.unique1)
+                ->  Bitmap Index Scan on tenk1_thous_tenthous
+                      Index Cond: (thousand = a.unique1)
+    ->  Index Scan using tenk1_unique2 on tenk1 c
+          Index Cond: ((unique2 = COALESCE(b.twothousand, a.twothousand)) AND (unique2 = 44))
+ (11 rows)
+
+ select a.unique1, b.unique1, c.unique1, coalesce(b.twothousand, a.twothousand)
+   from tenk1 a left join tenk1 b on b.thousand = a.unique1                        left join tenk1 c on c.unique2 =
coalesce(b.twothousand,a.twothousand) 
+   where a.unique2 = 5530 and coalesce(b.twothousand, a.twothousand) = 44;
+  unique1 | unique1 | unique1 | coalesce
+ ---------+---------+---------+----------
+ (0 rows)
+
+ --
  -- test ability to push constants through outer join clauses
  --
  explain (costs off)
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index e43b0fc..1edf9b3 100644
*** a/src/test/regress/sql/join.sql
--- b/src/test/regress/sql/join.sql
*************** select f1, unique2, case when unique2 is
*** 791,796 ****
--- 791,809 ----
    where (case when unique2 is null then f1 else 0 end) = 0;

  --
+ -- another case with equivalence clauses above outer joins (bug #8591)
+ --
+
+ explain (costs off)
+ select a.unique1, b.unique1, c.unique1, coalesce(b.twothousand, a.twothousand)
+   from tenk1 a left join tenk1 b on b.thousand = a.unique1                        left join tenk1 c on c.unique2 =
coalesce(b.twothousand,a.twothousand) 
+   where a.unique2 = 5530 and coalesce(b.twothousand, a.twothousand) = 44;
+
+ select a.unique1, b.unique1, c.unique1, coalesce(b.twothousand, a.twothousand)
+   from tenk1 a left join tenk1 b on b.thousand = a.unique1                        left join tenk1 c on c.unique2 =
coalesce(b.twothousand,a.twothousand) 
+   where a.unique2 = 5530 and coalesce(b.twothousand, a.twothousand) = 44;
+
+ --
  -- test ability to push constants through outer join clauses
  --


Re: Anybody using get_eclass_for_sort_expr in an extension?

From
Peter Geoghegan
Date:
On Thu, Nov 14, 2013 at 3:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'm a bit inclined to take the risk of breaking anything that's calling
> get_eclass_for_sort_expr() directly.  Thoughts?

It's worth being aware of the fact that Peter E's Jenkins instance
seems to track regressions for some popular third party extensions:

http://pgci.eisentraut.org/jenkins/view/Extensions/

Looking at what he is currently testing, these extensions seem
unlikely to fall afoul of your changes. Just something to be aware of
going forward.


-- 
Peter Geoghegan