Thread: 11beta crash/assert caused by parameter type changes

11beta crash/assert caused by parameter type changes

From
Andrew Gierth
Date:
This crashes for me with an assert failure on latest 11-stable (and
12devel):

create or replace function foo(text)
  returns text language plpgsql
  as $$
    declare
        r record;
        t text;
    begin
        execute $1 into r;
        select pg_typeof(r.a) into t;  -- dies on this one
        return format('type %s value %s', t, r.a::text);
    end;
$$;

contrib_regression=# select foo('select 1 as a');
         foo          
----------------------
 type integer value 1
(1 row)

contrib_regression=# select foo('select now() as a');
server closed the connection unexpectedly

The assert is in eval_const_expressions_mutator:

2586               Assert(prm->ptype == param->paramtype);

In 10.x I don't seem to get the assert but it does give a runtime error
from the executor ("type ... does not match that when preparing the plan").

Haven't had time yet to poke at what's changed in detail.

Here's the backtrace:

#3  0x0000000000b42d34 in ExceptionalCondition (conditionName=0xd2765c "!(prm->ptype == param->paramtype)", 
    errorType=0xbc0d4d "FailedAssertion", fileName=0xd27971 "clauses.c", lineNumber=2586) at assert.c:54
#4  0x0000000000882a0e in eval_const_expressions_mutator (node=0x80bea4438, context=0x7fffffffcc18) at clauses.c:2586
#5  0x00000000007d5311 in expression_tree_mutator (node=0x80bea43d8, mutator=0x8828a0 <eval_const_expressions_mutator>,

    context=0x7fffffffcc18) at nodeFuncs.c:2914
#6  0x00000000008868b3 in simplify_function (funcid=1619, result_type=2206, result_typmod=-1, result_collid=0,
input_collid=0,
 
    args_p=0x7fffffffb4d0, funcvariadic=false, process_args=true, allow_non_const=true, context=0x7fffffffcc18) at
clauses.c:4053
#7  0x0000000000882d99 in eval_const_expressions_mutator (node=0x80bea4380, context=0x7fffffffcc18) at clauses.c:2674
#8  0x00000000007d50fa in expression_tree_mutator (node=0x80bea4328, mutator=0x8828a0 <eval_const_expressions_mutator>,

    context=0x7fffffffcc18) at nodeFuncs.c:2865
#9  0x000000000088516a in eval_const_expressions_mutator (node=0x80bea4328, context=0x7fffffffcc18) at clauses.c:3669
#10 0x00000000007d5311 in expression_tree_mutator (node=0x80bea42c8, mutator=0x8828a0 <eval_const_expressions_mutator>,

    context=0x7fffffffcc18) at nodeFuncs.c:2914
#11 0x000000000088516a in eval_const_expressions_mutator (node=0x80bea42c8, context=0x7fffffffcc18) at clauses.c:3669
#12 0x0000000000882891 in eval_const_expressions (root=0x80bea4530, node=0x80bea42c8) at clauses.c:2472
#13 0x000000000085ed22 in preprocess_expression (root=0x80bea4530, expr=0x80bea42c8, kind=1) at planner.c:1041
#14 0x000000000085df1a in subquery_planner (glob=0x80bea4498, parse=0x80bea4178, parent_root=0x0, hasRecursion=false, 
    tuple_fraction=0) at planner.c:732
#15 0x000000000085d34d in standard_planner (parse=0x80bea4178, cursorOptions=256, boundParams=0x802ffa118) at
planner.c:405
#16 0x000000000085d0c0 in planner (parse=0x80bea4178, cursorOptions=256, boundParams=0x802ffa118) at planner.c:263
#17 0x0000000000987270 in pg_plan_query (querytree=0x80bea4178, cursorOptions=256, boundParams=0x802ffa118) at
postgres.c:809
#18 0x00000000009873d1 in pg_plan_queries (querytrees=0x80bea4118, cursorOptions=256, boundParams=0x802ffa118) at
postgres.c:875
#19 0x0000000000b28d3e in BuildCachedPlan (plansource=0x80bea3118, qlist=0x80bea4118, boundParams=0x802ffa118,
queryEnv=0x0)
    at plancache.c:938
#20 0x0000000000b28231 in GetCachedPlan (plansource=0x80bea3118, boundParams=0x802ffa118, useResOwner=true,
queryEnv=0x0)
    at plancache.c:1213
#21 0x00000000007a767b in _SPI_execute_plan (plan=0x80becc118, paramLI=0x802ffa118, snapshot=0x0,
crosscheck_snapshot=0x0,
 
    read_only=false, fire_triggers=true, tcount=1) at spi.c:2127
#22 0x00000000007a81cd in SPI_execute_plan_with_paramlist (plan=0x80becc118, params=0x802ffa118, read_only=false,
tcount=1)
    at spi.c:486
#23 0x000000080c023bfc in exec_stmt_execsql (estate=0x7fffffffd548, stmt=0x80bead5a0) at pl_exec.c:4084
[etc]

-- 
Andrew (irc:RhodiumToad)


Re: 11beta crash/assert caused by parameter type changes

From
Vik Fearing
Date:
On 24/07/18 22:50, Andrew Gierth wrote:
> Haven't had time yet to poke at what's changed in detail.

git bisect lays the blame on 6719b238e8f0ea83c39d2b1728c7670cdaa34e06.
-- 
Vik Fearing                                          +33 6 46 75 15 36
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: 11beta crash/assert caused by parameter type changes

From
Robert Haas
Date:
On Tue, Jul 24, 2018 at 5:34 PM, Vik Fearing
<vik.fearing@2ndquadrant.com> wrote:
> On 24/07/18 22:50, Andrew Gierth wrote:
>> Haven't had time yet to poke at what's changed in detail.
>
> git bisect lays the blame on 6719b238e8f0ea83c39d2b1728c7670cdaa34e06.

Sounds like an open item for Tom, then.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: 11beta crash/assert caused by parameter type changes

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Jul 24, 2018 at 5:34 PM, Vik Fearing
> <vik.fearing@2ndquadrant.com> wrote:
>> git bisect lays the blame on 6719b238e8f0ea83c39d2b1728c7670cdaa34e06.

> Sounds like an open item for Tom, then.

Yeah, agreed, but I've not had time for it yet.

            regards, tom lane


Re: 11beta crash/assert caused by parameter type changes

From
Tom Lane
Date:
Vik Fearing <vik.fearing@2ndquadrant.com> writes:
> On 24/07/18 22:50, Andrew Gierth wrote:
>> Haven't had time yet to poke at what's changed in detail.

> git bisect lays the blame on 6719b238e8f0ea83c39d2b1728c7670cdaa34e06.

Right.  So the reason that this Assert is reachable now is that that
commit made it possible to, as the commit message said, expose the
values of record-field variables to the planner.  All the other places
that call paramFetch hooks either believe the returned "prmtype" or
do run-time checks that it matches what they think it should be.
This one's out of step with a simple assertion, so I think a reasonable
fix would be as attached.

Eventually, of course, we should make such cases Actually Work by forcing
a replan of the cached plan.  But it was not the intent of anything I did
in v11 to make that happen, only to lay a bit more groundwork for it.
So it'd be unreasonable to try to fix that during beta.

I was about to add Andrew's example as a test case (also shown in
attached), but realized that there's a problem: just as noted in
the similar test for named-composite-type changes a bit above there,
the failure fails to fail in CLOBBER_CACHE_ALWAYS builds.  We'd decided
to just omit that test (cf feb1cc559) but having been embarrassed by
this crash I'm feeling like we really could do with some test coverage.
I'm tempted to extract the affected test cases into their own small
test script and provide two variant expected files, one for normal and
one for CLOBBER_CACHE_ALWAYS builds.  Thoughts?

            regards, tom lane

diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 505ae0a..a04ad6e 100644
*** a/src/backend/optimizer/util/clauses.c
--- b/src/backend/optimizer/util/clauses.c
*************** eval_const_expressions_mutator(Node *nod
*** 2567,2573 ****
                      else
                          prm = ¶mLI->params[param->paramid - 1];

!                     if (OidIsValid(prm->ptype))
                      {
                          /* OK to substitute parameter value? */
                          if (context->estimate ||
--- 2567,2580 ----
                      else
                          prm = ¶mLI->params[param->paramid - 1];

!                     /*
!                      * We don't just check OidIsValid, but insist that the
!                      * fetched type match the Param, just in case the hook did
!                      * something unexpected.  No need to throw an error here
!                      * though; leave that for runtime.
!                      */
!                     if (OidIsValid(prm->ptype) &&
!                         prm->ptype == param->paramtype)
                      {
                          /* OK to substitute parameter value? */
                          if (context->estimate ||
*************** eval_const_expressions_mutator(Node *nod
*** 2583,2589 ****
                              bool        typByVal;
                              Datum        pval;

-                             Assert(prm->ptype == param->paramtype);
                              get_typlenbyval(param->paramtype,
                                              &typLen, &typByVal);
                              if (prm->isnull || typByVal)
--- 2590,2595 ----
diff --git a/src/pl/plpgsql/src/expected/plpgsql_record.out b/src/pl/plpgsql/src/expected/plpgsql_record.out
index 6ea88b3..b5129b3 100644
*** a/src/pl/plpgsql/src/expected/plpgsql_record.out
--- b/src/pl/plpgsql/src/expected/plpgsql_record.out
*************** alter table mutable drop column f3;
*** 459,464 ****
--- 459,487 ----
  select getf3(null::mutable);  -- fails again
  ERROR:  record "x" has no field "f3"
  \set SHOW_CONTEXT errors
+ -- check behavior with changes in a record rowtype
+ create function show_result_type(text) returns text language plpgsql as
+ $$
+     declare
+         r record;
+         t text;
+     begin
+         execute $1 into r;
+         select pg_typeof(r.a) into t;
+         return format('type %s value %s', t, r.a::text);
+     end;
+ $$;
+ select show_result_type('select 1 as a');
+    show_result_type
+ ----------------------
+  type integer value 1
+ (1 row)
+
+ -- currently this fails, perhaps someday we can make it succeed
+ select show_result_type('select 2.0 as a');
+ ERROR:  type of parameter 5 (numeric) does not match that when preparing the plan (integer)
+ CONTEXT:  SQL statement "select pg_typeof(r.a)"
+ PL/pgSQL function show_result_type(text) line 7 at SQL statement
  -- check access to system columns in a record variable
  create function sillytrig() returns trigger language plpgsql as
  $$begin
diff --git a/src/pl/plpgsql/src/sql/plpgsql_record.sql b/src/pl/plpgsql/src/sql/plpgsql_record.sql
index aba6887..6f638d6 100644
*** a/src/pl/plpgsql/src/sql/plpgsql_record.sql
--- b/src/pl/plpgsql/src/sql/plpgsql_record.sql
*************** alter table mutable drop column f3;
*** 297,302 ****
--- 297,319 ----
  select getf3(null::mutable);  -- fails again
  \set SHOW_CONTEXT errors

+ -- check behavior with changes in a record rowtype
+ create function show_result_type(text) returns text language plpgsql as
+ $$
+     declare
+         r record;
+         t text;
+     begin
+         execute $1 into r;
+         select pg_typeof(r.a) into t;
+         return format('type %s value %s', t, r.a::text);
+     end;
+ $$;
+
+ select show_result_type('select 1 as a');
+ -- currently this fails, perhaps someday we can make it succeed
+ select show_result_type('select 2.0 as a');
+
  -- check access to system columns in a record variable

  create function sillytrig() returns trigger language plpgsql as

Re: 11beta crash/assert caused by parameter type changes

From
Robert Haas
Date:
On Thu, Jul 26, 2018 at 2:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I was about to add Andrew's example as a test case (also shown in
> attached), but realized that there's a problem: just as noted in
> the similar test for named-composite-type changes a bit above there,
> the failure fails to fail in CLOBBER_CACHE_ALWAYS builds.

Sorry if this is a dumb question, but why does that happen?  I thought
that the idea was that CLOBBER_CACHE_ALWAYS shouldn't change
semantics.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: 11beta crash/assert caused by parameter type changes

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Jul 26, 2018 at 2:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I was about to add Andrew's example as a test case (also shown in
>> attached), but realized that there's a problem: just as noted in
>> the similar test for named-composite-type changes a bit above there,
>> the failure fails to fail in CLOBBER_CACHE_ALWAYS builds.

> Sorry if this is a dumb question, but why does that happen?  I thought
> that the idea was that CLOBBER_CACHE_ALWAYS shouldn't change
> semantics.

What we need to fix the problem properly is to forcibly replan a
plancache entry when the type of one of the query's input parameters
changes.  If you turn on CCA, plancache entries are forcibly replanned
at every use.  So, problem goes away (in the most expensive way
possible).

            regards, tom lane


Re: 11beta crash/assert caused by parameter type changes

From
Andres Freund
Date:
Hi,

On 2018-07-26 14:06:16 -0400, Tom Lane wrote:
> I was about to add Andrew's example as a test case (also shown in
> attached), but realized that there's a problem: just as noted in
> the similar test for named-composite-type changes a bit above there,
> the failure fails to fail in CLOBBER_CACHE_ALWAYS builds.  We'd decided
> to just omit that test (cf feb1cc559) but having been embarrassed by
> this crash I'm feeling like we really could do with some test coverage.
> I'm tempted to extract the affected test cases into their own small
> test script and provide two variant expected files, one for normal and
> one for CLOBBER_CACHE_ALWAYS builds.  Thoughts?

Could we perhaps avoid the divergence by prevent a replan, e.g. by using
a cursor and starting the execution of the query before changing the
type?  Otherwise I think I'm ok with having an alternative file, as long
as there's a good comment explaining it.

-Andres


Re: 11beta crash/assert caused by parameter type changes

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2018-07-26 14:06:16 -0400, Tom Lane wrote:
>> I was about to add Andrew's example as a test case (also shown in
>> attached), but realized that there's a problem: just as noted in
>> the similar test for named-composite-type changes a bit above there,
>> the failure fails to fail in CLOBBER_CACHE_ALWAYS builds.  We'd decided
>> to just omit that test (cf feb1cc559) but having been embarrassed by
>> this crash I'm feeling like we really could do with some test coverage.
>> I'm tempted to extract the affected test cases into their own small
>> test script and provide two variant expected files, one for normal and
>> one for CLOBBER_CACHE_ALWAYS builds.  Thoughts?

> Could we perhaps avoid the divergence by prevent a replan, e.g. by using
> a cursor and starting the execution of the query before changing the
> type?  Otherwise I think I'm ok with having an alternative file, as long
> as there's a good comment explaining it.

Well, at the moment the point of having a test would be to ensure that the
code fails cleanly (without crashing) after a type change.  AFAICS your
proposal would amount to not really testing the type-change path at all,
so I don't think it helps.

I was envisioning a file header comment along the lines of

-- Cache-behavior-dependent test cases
--
-- These tests logically belong in plpgsql_record.sql, and perhaps someday
-- can be moved back there.  For now, however, their results are different
-- between regular and CLOBBER_CACHE_ALWAYS builds, so we must have two
-- expected-output files to cover both cases.  To minimize the maintenance
-- effort resulting from that, this file should contain only tests that
-- do have different results under CLOBBER_CACHE_ALWAYS.

            regards, tom lane