Home > mailing lists
Re: constraint exclusion and nulls in IN (..) clause - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: constraint exclusion and nulls in IN (..) clause
Date	March 10, 2018 07:40:16
Msg-id	15634.1520656816@sss.pgh.pa.us Whole thread Raw
In response to	Re: constraint exclusion and nulls in IN (..) clause (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: constraint exclusion and nulls in IN (..) clause Re: constraint exclusion and nulls in IN (..) clause Re: constraint exclusion and nulls in IN (..) clause
List	pgsql-hackers
Tree view
I wrote:
> I think it'd make more sense to see about incorporating that idea in
> predicate_implied_by_simple_clause/predicate_refuted_by_simple_clause.

After further thought, it seems like the place to deal with this is
really operator_predicate_proof(), as in the attached draft patch
against HEAD.  This passes the smell test for me, in the sense that
it's an arguably correct and general extension of the proof rules,
but it could use more testing.

TBH, the change in the existing regression test case in inherit.sql
makes me itch.  We've got

create table list_parted (
    a    varchar
) partition by list (a);
...
create table part_null_xy partition of list_parted for values in (null, 'xy');
...
explain (costs off) select * from list_parted where a = 'ab' or a in (null, 'cd');

Now, the fact that "null" is included in this query's IN clause is a
complete no-op, because the IN is using a strict equality operator.
So it's nice that the planner can see that and realize that there's
no point in scanning part_null_xy ... but this means that the syntax
that's been chosen for list partitioning is seriously misleading.
"in (null, 'xy')" in the CREATE TABLE command has nothing to do with
the semantics of that identical clause in any other context, and indeed
it seems chosen in a way to confuse even (or especially?) seasoned
experts.

I suppose it's too late to do anything about that now, but it sure
seems like NULL should've been handled some other way.

            regards, tom lane

diff --git a/src/backend/optimizer/util/predtest.c b/src/backend/optimizer/util/predtest.c
index fb963d0..446207d 100644
*** a/src/backend/optimizer/util/predtest.c
--- b/src/backend/optimizer/util/predtest.c
*************** static Node *extract_not_arg(Node *claus
*** 100,106 ****
  static Node *extract_strong_not_arg(Node *clause);
  static bool clause_is_strict_for(Node *clause, Node *subexpr);
  static bool operator_predicate_proof(Expr *predicate, Node *clause,
!                          bool refute_it);
  static bool operator_same_subexprs_proof(Oid pred_op, Oid clause_op,
                               bool refute_it);
  static bool operator_same_subexprs_lookup(Oid pred_op, Oid clause_op,
--- 100,106 ----
  static Node *extract_strong_not_arg(Node *clause);
  static bool clause_is_strict_for(Node *clause, Node *subexpr);
  static bool operator_predicate_proof(Expr *predicate, Node *clause,
!                          bool refute_it, bool weak);
  static bool operator_same_subexprs_proof(Oid pred_op, Oid clause_op,
                               bool refute_it);
  static bool operator_same_subexprs_lookup(Oid pred_op, Oid clause_op,
*************** predicate_implied_by_simple_clause(Expr
*** 1137,1143 ****
      }

      /* Else try operator-related knowledge */
!     return operator_predicate_proof(predicate, clause, false);
  }

  /*----------
--- 1137,1143 ----
      }

      /* Else try operator-related knowledge */
!     return operator_predicate_proof(predicate, clause, false, weak);
  }

  /*----------
*************** predicate_refuted_by_simple_clause(Expr
*** 1232,1238 ****
      }

      /* Else try operator-related knowledge */
!     return operator_predicate_proof(predicate, clause, true);
  }


--- 1232,1238 ----
      }

      /* Else try operator-related knowledge */
!     return operator_predicate_proof(predicate, clause, true, weak);
  }


*************** static const StrategyNumber BT_refute_ta
*** 1498,1506 ****
   * values, since then the operators aren't being given identical inputs.  But
   * we only support that for btree operators, for which we can assume that all
   * non-null inputs result in non-null outputs, so that it doesn't matter which
!  * two non-null constants we consider.  Currently the code below just reports
!  * "proof failed" if either constant is NULL, but in some cases we could be
!  * smarter (and that likely would require checking strong vs. weak proofs).
   *
   * We can make proofs involving several expression forms (here "foo" and "bar"
   * represent subexpressions that are identical according to equal()):
--- 1498,1505 ----
   * values, since then the operators aren't being given identical inputs.  But
   * we only support that for btree operators, for which we can assume that all
   * non-null inputs result in non-null outputs, so that it doesn't matter which
!  * two non-null constants we consider.  If either constant is NULL, we have
!  * to think harder, but sometimes the proof still works, as explained below.
   *
   * We can make proofs involving several expression forms (here "foo" and "bar"
   * represent subexpressions that are identical according to equal()):
*************** static const StrategyNumber BT_refute_ta
*** 1528,1534 ****
   * and we dare not make deductions with those.
   */
  static bool
! operator_predicate_proof(Expr *predicate, Node *clause, bool refute_it)
  {
      OpExpr       *pred_opexpr,
                 *clause_opexpr;
--- 1527,1534 ----
   * and we dare not make deductions with those.
   */
  static bool
! operator_predicate_proof(Expr *predicate, Node *clause,
!                          bool refute_it, bool weak)
  {
      OpExpr       *pred_opexpr,
                 *clause_opexpr;
*************** operator_predicate_proof(Expr *predicate
*** 1675,1691 ****
       * We have two identical subexpressions, and two other subexpressions that
       * are not identical but are both Consts; and we have commuted the
       * operators if necessary so that the Consts are on the right.  We'll need
!      * to compare the Consts' values.  If either is NULL, fail.
!      *
!      * Future work: in some cases the desired proof might hold even with NULL
!      * constants.  But beware that we've not yet identified the operators as
!      * btree ops, so for instance it'd be quite unsafe to assume they are
!      * strict without checking.
       */
-     if (pred_const->constisnull)
-         return false;
      if (clause_const->constisnull)
          return false;

      /*
       * Lookup the constant-comparison operator using the system catalogs and
--- 1675,1720 ----
       * We have two identical subexpressions, and two other subexpressions that
       * are not identical but are both Consts; and we have commuted the
       * operators if necessary so that the Consts are on the right.  We'll need
!      * to compare the Consts' values.  If either is NULL, we can't do that, so
!      * usually the proof fails ... but in some cases we can claim success.
       */
      if (clause_const->constisnull)
+     {
+         /* If clause_op isn't strict, we can't prove anything */
+         if (!op_strict(clause_op))
+             return false;
+
+         /*
+          * At this point we know that the clause returns NULL.  For proof
+          * types that assume truth of the clause, this means the proof is
+          * vacuously true (a/k/a "false implies anything").  That's all proof
+          * types except weak implication.
+          */
+         if (!(weak && !refute_it))
+             return true;
+
+         /*
+          * For weak implication, it's still possible for the proof to succeed,
+          * if the predicate can also be proven NULL.  In that case we've got
+          * NULL => NULL which is valid for this proof type.
+          */
+         if (pred_const->constisnull && op_strict(pred_op))
+             return true;
+         /* Else the proof fails */
+         return false;
+     }
+     if (pred_const->constisnull)
+     {
+         /*
+          * If the pred_op is strict, we know the predicate yields NULL, which
+          * means the proof succeeds for either weak implication or weak
+          * refutation.
+          */
+         if (weak && op_strict(pred_op))
+             return true;
+         /* Else the proof fails */
          return false;
+     }

      /*
       * Lookup the constant-comparison operator using the system catalogs and
diff --git a/src/test/modules/test_predtest/expected/test_predtest.out
b/src/test/modules/test_predtest/expected/test_predtest.out
index 9dd8aec..5574e03 100644
*** a/src/test/modules/test_predtest/expected/test_predtest.out
--- b/src/test/modules/test_predtest/expected/test_predtest.out
*************** w_i_holds         | f
*** 781,793 ****
  s_r_holds         | f
  w_r_holds         | f

- -- XXX ideally, we could prove this case too, for strong implication
  select * from test_predtest($$
  select x <= 5, x in (1,3,5,null)
  from integers
  $$);
  -[ RECORD 1 ]-----+--
! strong_implied_by | f
  weak_implied_by   | f
  strong_refuted_by | f
  weak_refuted_by   | f
--- 781,792 ----
  s_r_holds         | f
  w_r_holds         | f

  select * from test_predtest($$
  select x <= 5, x in (1,3,5,null)
  from integers
  $$);
  -[ RECORD 1 ]-----+--
! strong_implied_by | t
  weak_implied_by   | f
  strong_refuted_by | f
  weak_refuted_by   | f
diff --git a/src/test/modules/test_predtest/sql/test_predtest.sql
b/src/test/modules/test_predtest/sql/test_predtest.sql
index 298b8bf..2734735 100644
*** a/src/test/modules/test_predtest/sql/test_predtest.sql
--- b/src/test/modules/test_predtest/sql/test_predtest.sql
*************** select x <= 5, x in (1,3,5,7)
*** 306,312 ****
  from integers
  $$);

- -- XXX ideally, we could prove this case too, for strong implication
  select * from test_predtest($$
  select x <= 5, x in (1,3,5,null)
  from integers
--- 306,311 ----
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index a79f891..d56e5d2 100644
*** a/src/test/regress/expected/inherit.out
--- b/src/test/regress/expected/inherit.out
*************** explain (costs off) select * from list_p
*** 1715,1725 ****
   Append
     ->  Seq Scan on part_ab_cd
           Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
!    ->  Seq Scan on part_ef_gh
!          Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
!    ->  Seq Scan on part_null_xy
!          Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
! (7 rows)

  explain (costs off) select * from list_parted where a = 'ab';
                  QUERY PLAN
--- 1715,1721 ----
   Append
     ->  Seq Scan on part_ab_cd
           Filter: (((a)::text = 'ab'::text) OR ((a)::text = ANY ('{NULL,cd}'::text[])))
! (3 rows)

  explain (costs off) select * from list_parted where a = 'ab';
                  QUERY PLAN
pgsql-hackers by date:
From: "David G. Johnston"
Date: 10 March 2018, 05:58:10
Subject: Re: csv format for psql
From: Fabien COELHO
Date: 10 March 2018, 09:45:03
Subject: Re: csv format for psql
Re: constraint exclusion and nulls in IN (..) clause - Mailing list pgsql-hackers

Previous

Next