preptlist.c can insert unprocessed expression trees - Mailing list pgsql-hackers

From Tom Lane
Subject preptlist.c can insert unprocessed expression trees
Date
Msg-id 1865579.1738113656@sss.pgh.pa.us
Whole thread Raw
Responses Re: preptlist.c can insert unprocessed expression trees
List pgsql-hackers
I happened across a not-great behavior in expand_insert_targetlist.
If a column is omitted in an INSERT, and there's no column
default, the code generates a NULL Const to be inserted.
Furthermore, if the column is of a domain type, we wrap the
Const in CoerceToDomain, so as to throw a run-time error if the
domain has a NOT NULL constraint.  That's fine as far as
it goes, but there are two problems:

1. We're being sloppy about the type/typmod that the Const is
labeled with.  It really should have the domain's base type/typmod,
since it's the input to CoerceToDomain not the output.  This can
result in coerce_to_domain inserting a useless length-coercion
function (useless because it's being applied to a null).

2. We're not applying expression preprocessing (specifically,
eval_const_expressions) to the resulting expression tree.
The planner's primary expression-preprocessing pass already happened,
so that means the length coercion step and CoerceToDomain node miss
preprocessing altogether.

You can observe that there is a problem with this script:

-----
create domain d11 as varchar(11);
create table td11 (f1 int, f2 d11);
set debug_print_plan = 1;
insert into td11 values(0, null);
insert into td11 values(0);
-----

Comparing the plan tree dumps for the two INSERTs, the first just
shows a simple NULL Const of the domain type as the source for f2.
But the second shows a NULL Const of the domain type that is fed
to a varchar length-checking function and then to CoerceToDomain.
Of course they really ought to look the same.

When this code was last touched --- over twenty years ago, looks like
--- neither of these oversights meant anything more than a little
inefficiency.  However, as we've loaded more and more responsibility
onto eval_const_expressions, it's turned into what's probably a live
bug.  Specifically, if the length-coercion function call needed
default-argument insertion or named-argument reordering, things would
blow up pretty good.  That's not the case for any in-core data types,
but I wonder whether any extensions create such things.  The authors
wouldn't find out about the issue unless they tried making a domain
on top of their type, so it could have gone unreported.  So I think
this is something that needs to be fixed, and probably back-patched,
even though I don't have a test case that exhibits a crash.
(I actually found this while working on a patch that adds some
more work to eval_const_expressions, so we'll need to deal with
it going forward even if we opt not to back-patch.)

There are a few places in the rewriter that do the same sort of
thing (probably copied-and-pasted from preptlist at some point).
Those are before the planner so the results will get preprocessed
later, but it's still not great if they insert useless length-
coercion calls.  So I felt it was worth writing a utility function to
consolidate all those usages into one copy.  I'm not quite sure about
what to call it though.  In the attached 0001 patch I called it
coerce_null_to_domain and put it in parse_coerce.c.  Another idea
I considered was to consider it as a variant of makeConst and
put it in makefuncs.c.  But that would require makefuncs.c to call
parse_coerce.c which seems like a layering violation.  Anyone have
a better idea?

0001 does result in some cosmetic changes in postgres_fdw's
regression output.  That's because we're now careful to label
the null Const with the column's typmod, which we were not
before.

It seems to me that part of the problem here is that coerce_to_domain
is willing to look up the domain's base type/typmod if the caller
doesn't want to supply it.  With these changes, there's basically
noplace where the caller hasn't already looked that up, and I think
that that's probably required for correct usage.  (The caller has
to produce an input that's of the base type, after all.)  So it seems
like that's not a convenience so much as an encouragement to incorrect
coding.  I propose, for HEAD only, 0002 which removes that misfeature
and requires callers to supply the info.

            regards, tom lane

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 85252cbdbc..7e487cff2a 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -4273,13 +4273,13 @@ EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st6;

 PREPARE st7 AS INSERT INTO ft1 (c1,c2,c3) VALUES (1001,101,'foo');
 EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st7;
-                                                                                           QUERY PLAN
                                                                          

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                             QUERY PLAN
                                                                              

+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Insert on public.ft1
    Remote SQL: INSERT INTO "S 1"."T 1"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
    Batch Size: 1
    ->  Result
-         Output: NULL::integer, 1001, 101, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time
zone,NULL::character varying, 'ft1       '::character(10), NULL::user_enum 
+         Output: NULL::integer, 1001, 101, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time
zone,NULL::character varying(10), 'ft1       '::character(10), NULL::user_enum 
 (5 rows)

 ALTER TABLE "S 1"."T 1" RENAME TO "T 0";
@@ -4307,13 +4307,13 @@ EXECUTE st6;
 (9 rows)

 EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st7;
-                                                                                           QUERY PLAN
                                                                          

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                             QUERY PLAN
                                                                              

+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Insert on public.ft1
    Remote SQL: INSERT INTO "S 1"."T 0"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
    Batch Size: 1
    ->  Result
-         Output: NULL::integer, 1001, 101, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time
zone,NULL::character varying, 'ft1       '::character(10), NULL::user_enum 
+         Output: NULL::integer, 1001, 101, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time
zone,NULL::character varying(10), 'ft1       '::character(10), NULL::user_enum 
 (5 rows)

 ALTER TABLE "S 1"."T 0" RENAME TO "T 1";
@@ -4961,13 +4961,13 @@ SELECT ft1.c1 FROM ft1 JOIN ft2 on ft1.c1 = ft2.c1 WHERE
 -- ===================================================================
 EXPLAIN (verbose, costs off)
 INSERT INTO ft2 (c1,c2,c3) SELECT c1+1000,c2+100, c3 || c3 FROM ft2 LIMIT 20;
-
QUERYPLAN
     

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+
QUERYPLAN
       

+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Insert on public.ft2
    Remote SQL: INSERT INTO "S 1"."T 1"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
    Batch Size: 1
    ->  Subquery Scan on "*SELECT*"
-         Output: "*SELECT*"."?column?", "*SELECT*"."?column?_1", NULL::integer, "*SELECT*"."?column?_2",
NULL::timestampwith time zone, NULL::timestamp without time zone, NULL::character varying, 'ft2       '::character(10),
NULL::user_enum
+         Output: "*SELECT*"."?column?", "*SELECT*"."?column?_1", NULL::integer, "*SELECT*"."?column?_2",
NULL::timestampwith time zone, NULL::timestamp without time zone, NULL::character varying(10), 'ft2
'::character(10),NULL::user_enum 
          ->  Foreign Scan on public.ft2 ft2_1
                Output: (ft2_1.c1 + 1000), (ft2_1.c2 + 100), (ft2_1.c3 || ft2_1.c3)
                Remote SQL: SELECT "C 1", c2, c3 FROM "S 1"."T 1" LIMIT 20::bigint
@@ -6125,14 +6125,14 @@ SELECT c1,c2,c3,c4 FROM ft2 ORDER BY c1;

 EXPLAIN (verbose, costs off)
 INSERT INTO ft2 (c1,c2,c3) VALUES (1200,999,'foo') RETURNING tableoid::regclass;
-                                                                                           QUERY PLAN
                                                                          

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                                             QUERY PLAN
                                                                              

+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Insert on public.ft2
    Output: (ft2.tableoid)::regclass
    Remote SQL: INSERT INTO "S 1"."T 1"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
    Batch Size: 1
    ->  Result
-         Output: 1200, 999, NULL::integer, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time
zone,NULL::character varying, 'ft2       '::character(10), NULL::user_enum 
+         Output: 1200, 999, NULL::integer, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time
zone,NULL::character varying(10), 'ft2       '::character(10), NULL::user_enum 
 (6 rows)

 INSERT INTO ft2 (c1,c2,c3) VALUES (1200,999,'foo') RETURNING tableoid::regclass;
diff --git a/src/backend/optimizer/prep/preptlist.c b/src/backend/optimizer/prep/preptlist.c
index 7c439d1947..c2a77503d4 100644
--- a/src/backend/optimizer/prep/preptlist.c
+++ b/src/backend/optimizer/prep/preptlist.c
@@ -46,7 +46,8 @@
 #include "parser/parsetree.h"
 #include "utils/rel.h"

-static List *expand_insert_targetlist(List *tlist, Relation rel);
+static List *expand_insert_targetlist(PlannerInfo *root, List *tlist,
+                                      Relation rel);


 /*
@@ -102,7 +103,7 @@ preprocess_targetlist(PlannerInfo *root)
      */
     tlist = parse->targetList;
     if (command_type == CMD_INSERT)
-        tlist = expand_insert_targetlist(tlist, target_relation);
+        tlist = expand_insert_targetlist(root, tlist, target_relation);
     else if (command_type == CMD_UPDATE)
         root->update_colnos = extract_update_targetlist_colnos(tlist);

@@ -148,7 +149,8 @@ preprocess_targetlist(PlannerInfo *root)
             ListCell   *l2;

             if (action->commandType == CMD_INSERT)
-                action->targetList = expand_insert_targetlist(action->targetList,
+                action->targetList = expand_insert_targetlist(root,
+                                                              action->targetList,
                                                               target_relation);
             else if (action->commandType == CMD_UPDATE)
                 action->updateColnos =
@@ -376,7 +378,7 @@ extract_update_targetlist_colnos(List *tlist)
  * but now this code is only applied to INSERT targetlists.
  */
 static List *
-expand_insert_targetlist(List *tlist, Relation rel)
+expand_insert_targetlist(PlannerInfo *root, List *tlist, Relation rel)
 {
     List       *new_tlist = NIL;
     ListCell   *tlist_item;
@@ -430,26 +432,18 @@ expand_insert_targetlist(List *tlist, Relation rel)
              * confuse code comparing the finished plan to the target
              * relation, however.
              */
-            Oid            atttype = att_tup->atttypid;
-            Oid            attcollation = att_tup->attcollation;
             Node       *new_expr;

             if (!att_tup->attisdropped)
             {
-                new_expr = (Node *) makeConst(atttype,
-                                              -1,
-                                              attcollation,
-                                              att_tup->attlen,
-                                              (Datum) 0,
-                                              true, /* isnull */
-                                              att_tup->attbyval);
-                new_expr = coerce_to_domain(new_expr,
-                                            InvalidOid, -1,
-                                            atttype,
-                                            COERCION_IMPLICIT,
-                                            COERCE_IMPLICIT_CAST,
-                                            -1,
-                                            false);
+                new_expr = coerce_null_to_domain(att_tup->atttypid,
+                                                 att_tup->atttypmod,
+                                                 att_tup->attcollation,
+                                                 att_tup->attlen,
+                                                 att_tup->attbyval);
+                /* Must run expression preprocessing on any non-const nodes */
+                if (!IsA(new_expr, Const))
+                    new_expr = eval_const_expressions(root, new_expr);
             }
             else
             {
diff --git a/src/backend/parser/parse_coerce.c b/src/backend/parser/parse_coerce.c
index a6f4f2fc28..72747be9e2 100644
--- a/src/backend/parser/parse_coerce.c
+++ b/src/backend/parser/parse_coerce.c
@@ -414,6 +414,12 @@ coerce_type(ParseState *pstate, Node *node,
                                      &funcId);
     if (pathtype != COERCION_PATH_NONE)
     {
+        Oid            baseTypeId;
+        int32        baseTypeMod;
+
+        baseTypeMod = targetTypeMod;
+        baseTypeId = getBaseTypeAndTypmod(targetTypeId, &baseTypeMod);
+
         if (pathtype != COERCION_PATH_RELABELTYPE)
         {
             /*
@@ -423,12 +429,6 @@ coerce_type(ParseState *pstate, Node *node,
              * and we need to extract the correct typmod to use from the
              * domain's typtypmod.
              */
-            Oid            baseTypeId;
-            int32        baseTypeMod;
-
-            baseTypeMod = targetTypeMod;
-            baseTypeId = getBaseTypeAndTypmod(targetTypeId, &baseTypeMod);
-
             result = build_coercion_expression(node, pathtype, funcId,
                                                baseTypeId, baseTypeMod,
                                                ccontext, cformat, location);
@@ -454,7 +454,8 @@ coerce_type(ParseState *pstate, Node *node,
              * that must be accounted for.  If the destination is a domain
              * then we won't need a RelabelType node.
              */
-            result = coerce_to_domain(node, InvalidOid, -1, targetTypeId,
+            result = coerce_to_domain(node, baseTypeId, baseTypeMod,
+                                      targetTypeId,
                                       ccontext, cformat, location,
                                       false);
             if (result == node)
@@ -1263,6 +1264,43 @@ coerce_to_specific_type(ParseState *pstate, Node *node,
                                           constructName);
 }

+/*
+ * coerce_null_to_domain()
+ *        Build a NULL constant, then wrap it in CoerceToDomain
+ *        if the desired type is a domain type.  This allows any
+ *        NOT NULL domain constraint to be enforced at runtime.
+ */
+Node *
+coerce_null_to_domain(Oid typid, int32 typmod, Oid collation,
+                      int typlen, bool typbyval)
+{
+    Node       *result;
+    Oid            baseTypeId;
+    int32        baseTypeMod = typmod;
+
+    /*
+     * The constant must appear to have the domain's base type/typmod, else
+     * coerce_to_domain() will apply a length coercion which is useless.
+     */
+    baseTypeId = getBaseTypeAndTypmod(typid, &baseTypeMod);
+    result = (Node *) makeConst(baseTypeId,
+                                baseTypeMod,
+                                collation,
+                                typlen,
+                                (Datum) 0,
+                                true,    /* isnull */
+                                typbyval);
+    if (typid != baseTypeId)
+        result = coerce_to_domain(result,
+                                  baseTypeId, baseTypeMod,
+                                  typid,
+                                  COERCION_IMPLICIT,
+                                  COERCE_IMPLICIT_CAST,
+                                  -1,
+                                  false);
+    return result;
+}
+
 /*
  * parser_coercion_errposition - report coercion error location, if possible
  *
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index b74f2acc32..847edcfa90 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -1008,23 +1008,11 @@ rewriteTargetListIU(List *targetList,
                 if (commandType == CMD_INSERT)
                     new_tle = NULL;
                 else
-                {
-                    new_expr = (Node *) makeConst(att_tup->atttypid,
-                                                  -1,
-                                                  att_tup->attcollation,
-                                                  att_tup->attlen,
-                                                  (Datum) 0,
-                                                  true, /* isnull */
-                                                  att_tup->attbyval);
-                    /* this is to catch a NOT NULL domain constraint */
-                    new_expr = coerce_to_domain(new_expr,
-                                                InvalidOid, -1,
-                                                att_tup->atttypid,
-                                                COERCION_IMPLICIT,
-                                                COERCE_IMPLICIT_CAST,
-                                                -1,
-                                                false);
-                }
+                    new_expr = coerce_null_to_domain(att_tup->atttypid,
+                                                     att_tup->atttypmod,
+                                                     att_tup->attcollation,
+                                                     att_tup->attlen,
+                                                     att_tup->attbyval);
             }

             if (new_expr)
@@ -1572,21 +1560,11 @@ rewriteValuesRTE(Query *parsetree, RangeTblEntry *rte, int rti,
                         continue;
                     }

-                    new_expr = (Node *) makeConst(att_tup->atttypid,
-                                                  -1,
-                                                  att_tup->attcollation,
-                                                  att_tup->attlen,
-                                                  (Datum) 0,
-                                                  true, /* isnull */
-                                                  att_tup->attbyval);
-                    /* this is to catch a NOT NULL domain constraint */
-                    new_expr = coerce_to_domain(new_expr,
-                                                InvalidOid, -1,
-                                                att_tup->atttypid,
-                                                COERCION_IMPLICIT,
-                                                COERCE_IMPLICIT_CAST,
-                                                -1,
-                                                false);
+                    new_expr = coerce_null_to_domain(att_tup->atttypid,
+                                                     att_tup->atttypmod,
+                                                     att_tup->attcollation,
+                                                     att_tup->attlen,
+                                                     att_tup->attbyval);
                 }
                 newList = lappend(newList, new_expr);
             }
diff --git a/src/backend/rewrite/rewriteManip.c b/src/backend/rewrite/rewriteManip.c
index bca11500e9..a115b217c9 100644
--- a/src/backend/rewrite/rewriteManip.c
+++ b/src/backend/rewrite/rewriteManip.c
@@ -22,6 +22,7 @@
 #include "parser/parse_relation.h"
 #include "parser/parsetree.h"
 #include "rewrite/rewriteManip.h"
+#include "utils/lsyscache.h"


 typedef struct
@@ -1802,20 +1803,21 @@ ReplaceVarsFromTargetList_callback(Var *var,
                 return (Node *) var;

             case REPLACEVARS_SUBSTITUTE_NULL:
-
-                /*
-                 * If Var is of domain type, we should add a CoerceToDomain
-                 * node, in case there is a NOT NULL domain constraint.
-                 */
-                return coerce_to_domain((Node *) makeNullConst(var->vartype,
-                                                               var->vartypmod,
-                                                               var->varcollid),
-                                        InvalidOid, -1,
-                                        var->vartype,
-                                        COERCION_IMPLICIT,
-                                        COERCE_IMPLICIT_CAST,
-                                        -1,
-                                        false);
+                {
+                    /*
+                     * If Var is of domain type, we must add a CoerceToDomain
+                     * node, in case there is a NOT NULL domain constraint.
+                     */
+                    int16        vartyplen;
+                    bool        vartypbyval;
+
+                    get_typlenbyval(var->vartype, &vartyplen, &vartypbyval);
+                    return coerce_null_to_domain(var->vartype,
+                                                 var->vartypmod,
+                                                 var->varcollid,
+                                                 vartyplen,
+                                                 vartypbyval);
+                }
         }
         elog(ERROR, "could not find replacement targetlist entry for attno %d",
              var->varattno);
diff --git a/src/include/parser/parse_coerce.h b/src/include/parser/parse_coerce.h
index cfded2a8e8..8d775c72c5 100644
--- a/src/include/parser/parse_coerce.h
+++ b/src/include/parser/parse_coerce.h
@@ -63,6 +63,9 @@ extern Node *coerce_to_specific_type_typmod(ParseState *pstate, Node *node,
                                             Oid targetTypeId, int32 targetTypmod,
                                             const char *constructName);

+extern Node *coerce_null_to_domain(Oid typid, int32 typmod, Oid collation,
+                                   int typlen, bool typbyval);
+
 extern int    parser_coercion_errposition(ParseState *pstate,
                                         int coerce_location,
                                         Node *input_expr);
diff --git a/src/backend/parser/parse_coerce.c b/src/backend/parser/parse_coerce.c
index 72747be9e2..0b5b81c7f2 100644
--- a/src/backend/parser/parse_coerce.c
+++ b/src/backend/parser/parse_coerce.c
@@ -661,10 +661,8 @@ can_coerce_type(int nargs, const Oid *input_typeids, const Oid *target_typeids,
  * Create an expression tree to represent coercion to a domain type.
  *
  * 'arg': input expression
- * 'baseTypeId': base type of domain, if known (pass InvalidOid if caller
- *        has not bothered to look this up)
- * 'baseTypeMod': base type typmod of domain, if known (pass -1 if caller
- *        has not bothered to look this up)
+ * 'baseTypeId': base type of domain
+ * 'baseTypeMod': base type typmod of domain
  * 'typeId': target type to coerce to
  * 'ccontext': context indicator to control coercions
  * 'cformat': coercion display format
@@ -680,9 +678,8 @@ coerce_to_domain(Node *arg, Oid baseTypeId, int32 baseTypeMod, Oid typeId,
 {
     CoerceToDomain *result;

-    /* Get the base type if it hasn't been supplied */
-    if (baseTypeId == InvalidOid)
-        baseTypeId = getBaseTypeAndTypmod(typeId, &baseTypeMod);
+    /* We now require the caller to supply correct baseTypeId/baseTypeMod */
+    Assert(OidIsValid(baseTypeId));

     /* If it isn't a domain, return the node as it was passed in */
     if (baseTypeId == typeId)

pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Adding skip scan (including MDAM style range skip scan) to nbtree
Next
From: Peter Smith
Date:
Subject: Re: Improve error handling for invalid slots and ensure a same 'inactive_since' time for inactive slots