Thread: BUG #17610: Use of multiple composite types incompatible with record-typed function parameter

BUG #17610: Use of multiple composite types incompatible with record-typed function parameter

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17610
Logged by:          Martin Jurča
Email address:      mjurca@centrum.cz
PostgreSQL version: 14.5
Operating system:   Linux, Debian 10.2.1-6, x86
Description:

The SQL code at the end of this bug report ends with the following error:

2022-09-08 10:09:02.173 GMT [174] ERROR:  type of parameter 1
(composite_type_2) does not match that when preparing the plan
(composite_type_1)
2022-09-08 10:09:02.173 GMT [174] CONTEXT:  PL/pgSQL function
polymorphic_fuction(record) line 6 at EXECUTE
2022-09-08 10:09:02.173 GMT [174] STATEMENT:  SELECT * FROM
polymorphic_fuction(
    CAST(ROW(2) AS composite_type_2)
    );
ERROR:  42804: type of parameter 1 (composite_type_2) does not match that
when preparing the plan (composite_type_1)
CONTEXT:  PL/pgSQL function polymorphic_fuction(record) line 6 at EXECUTE
LOCATION:  plpgsql_param_eval_generic_ro, pl_exec.c:6648

The expected output is:

 polymorphic_fuction 
---------------------
                   2
(1 row)

If I understand the documentation
(https://www.postgresql.org/docs/current/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING)
correctly, this should work. I noticed that the record type is not listed
among polymorphic types
(https://www.postgresql.org/docs/current/extend-type-system.html#EXTEND-TYPES-POLYMORPHIC),
but PL/pgSQL documentation
(https://www.postgresql.org/docs/current/plpgsql-overview.html#PLPGSQL-ARGS-RESULTS)
states that any composite type can be passed to a record parameter.

Full version info:

PostgreSQL 14.5 on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6)
10.2.1 20210110, 64-bit

I am not sure if this is an error on my part, at the PostgreSQL server, or
in the documentation, but any help or clarification would be appreciated.

SQL code producing the error:

CREATE TYPE composite_type_1 AS (
  num int
);

CREATE TYPE composite_type_2 AS (
  num int
);

CREATE FUNCTION polymorphic_fuction(
  arg record
)
RETURNS int
LANGUAGE plpgsql
AS $$
DECLARE
  result_num int;
BEGIN
  -- Using either SELECT or EXECUTE has the same outcome (this I
understand).
  EXECUTE 'SELECT ($1).num'
  INTO STRICT result_num
  USING arg;
  RETURN result_num;
END;
$$;

SELECT * FROM polymorphic_fuction(
  CAST(ROW(1) AS composite_type_1)
);

SELECT * FROM polymorphic_fuction(
  CAST(ROW(2) AS composite_type_2)
);


On Thu, 08 Sep 2022 at 18:19, PG Bug reporting form <noreply@postgresql.org> wrote:
> The following bug has been logged on the website:
>
> Bug reference:      17610
> Logged by:          Martin Jurča
> Email address:      mjurca@centrum.cz
> PostgreSQL version: 14.5
> Operating system:   Linux, Debian 10.2.1-6, x86
> Description:
>
> The SQL code at the end of this bug report ends with the following error:
>
> 2022-09-08 10:09:02.173 GMT [174] ERROR:  type of parameter 1
> (composite_type_2) does not match that when preparing the plan
> (composite_type_1)
> 2022-09-08 10:09:02.173 GMT [174] CONTEXT:  PL/pgSQL function
> polymorphic_fuction(record) line 6 at EXECUTE
> 2022-09-08 10:09:02.173 GMT [174] STATEMENT:  SELECT * FROM
> polymorphic_fuction(
>     CAST(ROW(2) AS composite_type_2)
>     );
> ERROR:  42804: type of parameter 1 (composite_type_2) does not match that
> when preparing the plan (composite_type_1)
> CONTEXT:  PL/pgSQL function polymorphic_fuction(record) line 6 at EXECUTE
> LOCATION:  plpgsql_param_eval_generic_ro, pl_exec.c:6648
>
> The expected output is:
>
>  polymorphic_fuction
> ---------------------
>                    2
> (1 row)
>

The exec_eval_using_params [1] function will handle the parameters, however,
it creates a plan for the expression only the first time [2], for the other
time, it uses the cached plan, which also caches the parameters' type
information, so it causes the problem that you mentation.

> If I understand the documentation
> (https://www.postgresql.org/docs/current/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING)
> correctly, this should work. I noticed that the record type is not listed

The documentation says:

  The mutable nature of record variables presents another problem in this
  connection. When fields of a record variable are used in expressions or
  statements, **the data types of the fields must not change from one call
  of the function to the next**, since each expression will be analyzed
  using the data type that is present when the expression is first reached.

  EXECUTE can be used to get around this problem when necessary.

However, I'am not sure how to use EXECUTE to avoid this problem.


[1]
https://github.com/postgres/postgres/blob/ccd10a9bfa54c1aad3561232bf24222f1b455e1c/src/pl/plpgsql/src/pl_exec.c#L8573
[2]
https://github.com/postgres/postgres/blob/ccd10a9bfa54c1aad3561232bf24222f1b455e1c/src/pl/plpgsql/src/pl_exec.c#L5671

--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



Japin Li <japinli@hotmail.com> writes:
> On Thu, 08 Sep 2022 at 18:19, PG Bug reporting form <noreply@postgresql.org> wrote:
>> The SQL code at the end of this bug report ends with the following error:
>> 2022-09-08 10:09:02.173 GMT [174] ERROR:  type of parameter 1
>> (composite_type_2) does not match that when preparing the plan
>> (composite_type_1)

> The documentation says:

>   The mutable nature of record variables presents another problem in this
>   connection. When fields of a record variable are used in expressions or
>   statements, **the data types of the fields must not change from one call
>   of the function to the next**, since each expression will be analyzed
>   using the data type that is present when the expression is first reached.

I think that is specifically referring to function internal variables of
type RECORD, which are handled specially.  For function input arguments,
we could get around this by treating RECORD like a polymorphic type, as
attached.  (For some reason I thought we already did that, but nope.)

            regards, tom lane

diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 61fbdf0686..4d68f030e4 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -2498,9 +2498,15 @@ compute_function_hashkey(FunctionCallInfo fcinfo,
 
 /*
  * This is the same as the standard resolve_polymorphic_argtypes() function,
- * but with a special case for validation: assume that polymorphic arguments
- * are integer, integer-array or integer-range.  Also, we go ahead and report
- * the error if we can't resolve the types.
+ * except that:
+ * 1. We go ahead and report the error if we can't resolve the types.
+ * 2. We treat RECORD-type input parameters (not output parameters) as if
+ *    they were polymorphic, replacing their types with the actual input
+ *    types if we can determine those.  This allows us to create a separate
+ *    function cache entry for each named composite type passed to such a
+ *    parameter.
+ * 3. In validation mode, we have no inputs to look at, so assume that
+ *    polymorphic arguments are integer, integer-array or integer-range.
  */
 static void
 plpgsql_resolve_polymorphic_argtypes(int numargs,
@@ -2512,6 +2518,8 @@ plpgsql_resolve_polymorphic_argtypes(int numargs,
 
     if (!forValidator)
     {
+        int            inargno;
+
         /* normal case, pass to standard routine */
         if (!resolve_polymorphic_argtypes(numargs, argtypes, argmodes,
                                           call_expr))
@@ -2520,10 +2528,28 @@ plpgsql_resolve_polymorphic_argtypes(int numargs,
                      errmsg("could not determine actual argument "
                             "type for polymorphic function \"%s\"",
                             proname)));
+        /* also, treat RECORD inputs (but not outputs) as polymorphic */
+        inargno = 0;
+        for (i = 0; i < numargs; i++)
+        {
+            char        argmode = argmodes ? argmodes[i] : PROARGMODE_IN;
+
+            if (argmode == PROARGMODE_OUT || argmode == PROARGMODE_TABLE)
+                continue;
+            if (argtypes[i] == RECORDOID || argtypes[i] == RECORDARRAYOID)
+            {
+                Oid            resolvedtype = get_call_expr_argtype(call_expr,
+                                                                 inargno);
+
+                if (OidIsValid(resolvedtype))
+                    argtypes[i] = resolvedtype;
+            }
+            inargno++;
+        }
     }
     else
     {
-        /* special validation case */
+        /* special validation case (no need to do anything for RECORD) */
         for (i = 0; i < numargs; i++)
         {
             switch (argtypes[i])

I wrote:
> I think that is specifically referring to function internal variables of
> type RECORD, which are handled specially.  For function input arguments,
> we could get around this by treating RECORD like a polymorphic type, as
> attached.  (For some reason I thought we already did that, but nope.)

Oh ... I see why we hadn't noticed this before.  The case works fine
if you refer to the RECORD parameter by name, like this example from
our regression tests:

create function getf1(x record) returns int language plpgsql as
$$ begin return x.f1; end $$;

It does not work fine when you do this:

create function getf1(record) returns int language plpgsql as
$$ begin return $1.f1; end $$;

The reason is that the core parser's callback APIs allow plpgsql
to deal with the "x.f1" construct as a unit, and it can handle
the varying actual type of "x" internally.  However, the callback
APIs are not equivalently intelligent about "$N" references.
Those get resolved as Params of type RECORD (with a separate
FieldSelect on top, in this case), and then we don't have enough
context to figure out which record type is involved, or indeed
that different record types could be involved.

My proposed patch fixes things for the case where the caller
passes a named composite type, but not for passing an anonymous
record, as you can see in the test cases in the attached.
(If you don't apply the code patch, the last getf2() call also
fails, matching the OP's complaint.  But the getf1() calls
still work.)

So really the way we ought to fix this is to upgrade the parser
APIs so that plpgsql could deal with "$1.f1" as a unit.  But
that seems like a lot of work, and it would certainly not be
back-patchable.

In the meantime, I'm uncertain whether we want this change or not.
Duplicating the function cache entry for each composite type
that gets passed during a session is a pretty expensive solution,
especially if it only fixes cases that are being written in a
semi-deprecated fashion.

            regards, tom lane

diff --git a/src/pl/plpgsql/src/expected/plpgsql_record.out b/src/pl/plpgsql/src/expected/plpgsql_record.out
index 4c5d95c79e..afb922df29 100644
--- a/src/pl/plpgsql/src/expected/plpgsql_record.out
+++ b/src/pl/plpgsql/src/expected/plpgsql_record.out
@@ -2,6 +2,7 @@
 -- Tests for PL/pgSQL handling of composite (record) variables
 --
 create type two_int4s as (f1 int4, f2 int4);
+create type more_int4s as (f0 text, f1 int4, f2 int4);
 create type two_int8s as (q1 int8, q2 int8);
 create type nested_int8s as (c1 two_int8s, c2 two_int8s);
 -- base-case return of a composite type
@@ -426,6 +427,18 @@ select getf1(row(1,2));
      1
 (1 row)

+select getf1(row(1,2)::two_int4s);
+ getf1
+-------
+     1
+(1 row)
+
+select getf1(row('foo',123,456)::more_int4s);
+ getf1
+-------
+   123
+(1 row)
+
 -- the context stack is different when debug_discard_caches
 -- is set, so suppress context output
 \set SHOW_CONTEXT never
@@ -438,6 +451,28 @@ select getf1(row(1,2));
      1
 (1 row)

+-- this seemingly-equivalent case behaves a bit differently,
+-- because the core parser's handling of $N symbols is simplistic
+create function getf2(record) returns int language plpgsql as
+$$ begin return $1.f2; end $$;
+select getf2(row(1,2));  -- ideally would work, but does not
+ERROR:  could not identify column "f2" in record data type
+LINE 1: $1.f2
+        ^
+QUERY:  $1.f2
+CONTEXT:  PL/pgSQL function getf2(record) line 1 at RETURN
+select getf2(row(1,2)::two_int4s);
+ getf2
+-------
+     2
+(1 row)
+
+select getf2(row('foo',123,456)::more_int4s);
+ getf2
+-------
+   456
+(1 row)
+
 -- check behavior when assignment to FOR-loop variable requires coercion
 do $$
 declare r two_int8s;
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 61fbdf0686..b286f2a50c 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -2498,9 +2498,15 @@ compute_function_hashkey(FunctionCallInfo fcinfo,

 /*
  * This is the same as the standard resolve_polymorphic_argtypes() function,
- * but with a special case for validation: assume that polymorphic arguments
- * are integer, integer-array or integer-range.  Also, we go ahead and report
- * the error if we can't resolve the types.
+ * except that:
+ * 1. We go ahead and report the error if we can't resolve the types.
+ * 2. We treat RECORD-type input arguments (not output arguments) as if
+ *    they were polymorphic, replacing their types with the actual input
+ *    types if we can determine those.  This allows us to create a separate
+ *    function cache entry for each named composite type passed to such an
+ *    argument.
+ * 3. In validation mode, we have no inputs to look at, so assume that
+ *    polymorphic arguments are integer, integer-array or integer-range.
  */
 static void
 plpgsql_resolve_polymorphic_argtypes(int numargs,
@@ -2512,6 +2518,8 @@ plpgsql_resolve_polymorphic_argtypes(int numargs,

     if (!forValidator)
     {
+        int            inargno;
+
         /* normal case, pass to standard routine */
         if (!resolve_polymorphic_argtypes(numargs, argtypes, argmodes,
                                           call_expr))
@@ -2520,10 +2528,28 @@ plpgsql_resolve_polymorphic_argtypes(int numargs,
                      errmsg("could not determine actual argument "
                             "type for polymorphic function \"%s\"",
                             proname)));
+        /* also, treat RECORD inputs (but not outputs) as polymorphic */
+        inargno = 0;
+        for (i = 0; i < numargs; i++)
+        {
+            char        argmode = argmodes ? argmodes[i] : PROARGMODE_IN;
+
+            if (argmode == PROARGMODE_OUT || argmode == PROARGMODE_TABLE)
+                continue;
+            if (argtypes[i] == RECORDOID || argtypes[i] == RECORDARRAYOID)
+            {
+                Oid            resolvedtype = get_call_expr_argtype(call_expr,
+                                                                 inargno);
+
+                if (OidIsValid(resolvedtype))
+                    argtypes[i] = resolvedtype;
+            }
+            inargno++;
+        }
     }
     else
     {
-        /* special validation case */
+        /* special validation case (no need to do anything for RECORD) */
         for (i = 0; i < numargs; i++)
         {
             switch (argtypes[i])
diff --git a/src/pl/plpgsql/src/sql/plpgsql_record.sql b/src/pl/plpgsql/src/sql/plpgsql_record.sql
index 535a3407a4..db655335b1 100644
--- a/src/pl/plpgsql/src/sql/plpgsql_record.sql
+++ b/src/pl/plpgsql/src/sql/plpgsql_record.sql
@@ -3,6 +3,7 @@
 --

 create type two_int4s as (f1 int4, f2 int4);
+create type more_int4s as (f0 text, f1 int4, f2 int4);
 create type two_int8s as (q1 int8, q2 int8);
 create type nested_int8s as (c1 two_int8s, c2 two_int8s);

@@ -257,6 +258,8 @@ create function getf1(x record) returns int language plpgsql as
 $$ begin return x.f1; end $$;
 select getf1(1);
 select getf1(row(1,2));
+select getf1(row(1,2)::two_int4s);
+select getf1(row('foo',123,456)::more_int4s);
 -- the context stack is different when debug_discard_caches
 -- is set, so suppress context output
 \set SHOW_CONTEXT never
@@ -264,6 +267,14 @@ select getf1(row(1,2)::two_int8s);
 \set SHOW_CONTEXT errors
 select getf1(row(1,2));

+-- this seemingly-equivalent case behaves a bit differently,
+-- because the core parser's handling of $N symbols is simplistic
+create function getf2(record) returns int language plpgsql as
+$$ begin return $1.f2; end $$;
+select getf2(row(1,2));  -- ideally would work, but does not
+select getf2(row(1,2)::two_int4s);
+select getf2(row('foo',123,456)::more_int4s);
+
 -- check behavior when assignment to FOR-loop variable requires coercion
 do $$
 declare r two_int8s;

On Fri, 09 Sep 2022 at 02:36, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> The reason is that the core parser's callback APIs allow plpgsql
> to deal with the "x.f1" construct as a unit, and it can handle
> the varying actual type of "x" internally.  However, the callback
> APIs are not equivalently intelligent about "$N" references.
> Those get resolved as Params of type RECORD (with a separate
> FieldSelect on top, in this case), and then we don't have enough
> context to figure out which record type is involved, or indeed
> that different record types could be involved.
>
> My proposed patch fixes things for the case where the caller
> passes a named composite type, but not for passing an anonymous
> record, as you can see in the test cases in the attached.
> (If you don't apply the code patch, the last getf2() call also
> fails, matching the OP's complaint.  But the getf1() calls
> still work.)
>
> So really the way we ought to fix this is to upgrade the parser
> APIs so that plpgsql could deal with "$1.f1" as a unit.  But
> that seems like a lot of work, and it would certainly not be
> back-patchable.
>

Tested and looks good!

> In the meantime, I'm uncertain whether we want this change or not.
> Duplicating the function cache entry for each composite type
> that gets passed during a session is a pretty expensive solution,
> especially if it only fixes cases that are being written in a
> semi-deprecated fashion.
>

Yeah, maybe we can document it.  Or is there any way for the user to
discard the cache entry?

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



Japin Li <japinli@hotmail.com> writes:
> On Fri, 09 Sep 2022 at 02:36, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> So really the way we ought to fix this is to upgrade the parser
>> APIs so that plpgsql could deal with "$1.f1" as a unit.  But
>> that seems like a lot of work, and it would certainly not be
>> back-patchable.

> Tested and looks good!

After letting this bake awhile, I still haven't thought of a better
idea.  Also, even if we wanted to invest the effort to make "$1.f1"
an integrated construct, I think you could still easily break it,
say with syntax like "($1).f1".  The variant "(arg).f1" would be
problematic too without this change.

Hence, pushed as-is.

            regards, tom lane