Thread: 11beta crash/assert caused by parameter type changes
This crashes for me with an assert failure on latest 11-stable (and 12devel): create or replace function foo(text) returns text language plpgsql as $$ declare r record; t text; begin execute $1 into r; select pg_typeof(r.a) into t; -- dies on this one return format('type %s value %s', t, r.a::text); end; $$; contrib_regression=# select foo('select 1 as a'); foo ---------------------- type integer value 1 (1 row) contrib_regression=# select foo('select now() as a'); server closed the connection unexpectedly The assert is in eval_const_expressions_mutator: 2586 Assert(prm->ptype == param->paramtype); In 10.x I don't seem to get the assert but it does give a runtime error from the executor ("type ... does not match that when preparing the plan"). Haven't had time yet to poke at what's changed in detail. Here's the backtrace: #3 0x0000000000b42d34 in ExceptionalCondition (conditionName=0xd2765c "!(prm->ptype == param->paramtype)", errorType=0xbc0d4d "FailedAssertion", fileName=0xd27971 "clauses.c", lineNumber=2586) at assert.c:54 #4 0x0000000000882a0e in eval_const_expressions_mutator (node=0x80bea4438, context=0x7fffffffcc18) at clauses.c:2586 #5 0x00000000007d5311 in expression_tree_mutator (node=0x80bea43d8, mutator=0x8828a0 <eval_const_expressions_mutator>, context=0x7fffffffcc18) at nodeFuncs.c:2914 #6 0x00000000008868b3 in simplify_function (funcid=1619, result_type=2206, result_typmod=-1, result_collid=0, input_collid=0, args_p=0x7fffffffb4d0, funcvariadic=false, process_args=true, allow_non_const=true, context=0x7fffffffcc18) at clauses.c:4053 #7 0x0000000000882d99 in eval_const_expressions_mutator (node=0x80bea4380, context=0x7fffffffcc18) at clauses.c:2674 #8 0x00000000007d50fa in expression_tree_mutator (node=0x80bea4328, mutator=0x8828a0 <eval_const_expressions_mutator>, context=0x7fffffffcc18) at nodeFuncs.c:2865 #9 0x000000000088516a in eval_const_expressions_mutator (node=0x80bea4328, context=0x7fffffffcc18) at clauses.c:3669 #10 0x00000000007d5311 in expression_tree_mutator (node=0x80bea42c8, mutator=0x8828a0 <eval_const_expressions_mutator>, context=0x7fffffffcc18) at nodeFuncs.c:2914 #11 0x000000000088516a in eval_const_expressions_mutator (node=0x80bea42c8, context=0x7fffffffcc18) at clauses.c:3669 #12 0x0000000000882891 in eval_const_expressions (root=0x80bea4530, node=0x80bea42c8) at clauses.c:2472 #13 0x000000000085ed22 in preprocess_expression (root=0x80bea4530, expr=0x80bea42c8, kind=1) at planner.c:1041 #14 0x000000000085df1a in subquery_planner (glob=0x80bea4498, parse=0x80bea4178, parent_root=0x0, hasRecursion=false, tuple_fraction=0) at planner.c:732 #15 0x000000000085d34d in standard_planner (parse=0x80bea4178, cursorOptions=256, boundParams=0x802ffa118) at planner.c:405 #16 0x000000000085d0c0 in planner (parse=0x80bea4178, cursorOptions=256, boundParams=0x802ffa118) at planner.c:263 #17 0x0000000000987270 in pg_plan_query (querytree=0x80bea4178, cursorOptions=256, boundParams=0x802ffa118) at postgres.c:809 #18 0x00000000009873d1 in pg_plan_queries (querytrees=0x80bea4118, cursorOptions=256, boundParams=0x802ffa118) at postgres.c:875 #19 0x0000000000b28d3e in BuildCachedPlan (plansource=0x80bea3118, qlist=0x80bea4118, boundParams=0x802ffa118, queryEnv=0x0) at plancache.c:938 #20 0x0000000000b28231 in GetCachedPlan (plansource=0x80bea3118, boundParams=0x802ffa118, useResOwner=true, queryEnv=0x0) at plancache.c:1213 #21 0x00000000007a767b in _SPI_execute_plan (plan=0x80becc118, paramLI=0x802ffa118, snapshot=0x0, crosscheck_snapshot=0x0, read_only=false, fire_triggers=true, tcount=1) at spi.c:2127 #22 0x00000000007a81cd in SPI_execute_plan_with_paramlist (plan=0x80becc118, params=0x802ffa118, read_only=false, tcount=1) at spi.c:486 #23 0x000000080c023bfc in exec_stmt_execsql (estate=0x7fffffffd548, stmt=0x80bead5a0) at pl_exec.c:4084 [etc] -- Andrew (irc:RhodiumToad)
On 24/07/18 22:50, Andrew Gierth wrote: > Haven't had time yet to poke at what's changed in detail. git bisect lays the blame on 6719b238e8f0ea83c39d2b1728c7670cdaa34e06. -- Vik Fearing +33 6 46 75 15 36 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Tue, Jul 24, 2018 at 5:34 PM, Vik Fearing <vik.fearing@2ndquadrant.com> wrote: > On 24/07/18 22:50, Andrew Gierth wrote: >> Haven't had time yet to poke at what's changed in detail. > > git bisect lays the blame on 6719b238e8f0ea83c39d2b1728c7670cdaa34e06. Sounds like an open item for Tom, then. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Jul 24, 2018 at 5:34 PM, Vik Fearing > <vik.fearing@2ndquadrant.com> wrote: >> git bisect lays the blame on 6719b238e8f0ea83c39d2b1728c7670cdaa34e06. > Sounds like an open item for Tom, then. Yeah, agreed, but I've not had time for it yet. regards, tom lane
Vik Fearing <vik.fearing@2ndquadrant.com> writes: > On 24/07/18 22:50, Andrew Gierth wrote: >> Haven't had time yet to poke at what's changed in detail. > git bisect lays the blame on 6719b238e8f0ea83c39d2b1728c7670cdaa34e06. Right. So the reason that this Assert is reachable now is that that commit made it possible to, as the commit message said, expose the values of record-field variables to the planner. All the other places that call paramFetch hooks either believe the returned "prmtype" or do run-time checks that it matches what they think it should be. This one's out of step with a simple assertion, so I think a reasonable fix would be as attached. Eventually, of course, we should make such cases Actually Work by forcing a replan of the cached plan. But it was not the intent of anything I did in v11 to make that happen, only to lay a bit more groundwork for it. So it'd be unreasonable to try to fix that during beta. I was about to add Andrew's example as a test case (also shown in attached), but realized that there's a problem: just as noted in the similar test for named-composite-type changes a bit above there, the failure fails to fail in CLOBBER_CACHE_ALWAYS builds. We'd decided to just omit that test (cf feb1cc559) but having been embarrassed by this crash I'm feeling like we really could do with some test coverage. I'm tempted to extract the affected test cases into their own small test script and provide two variant expected files, one for normal and one for CLOBBER_CACHE_ALWAYS builds. Thoughts? regards, tom lane diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c index 505ae0a..a04ad6e 100644 *** a/src/backend/optimizer/util/clauses.c --- b/src/backend/optimizer/util/clauses.c *************** eval_const_expressions_mutator(Node *nod *** 2567,2573 **** else prm = ¶mLI->params[param->paramid - 1]; ! if (OidIsValid(prm->ptype)) { /* OK to substitute parameter value? */ if (context->estimate || --- 2567,2580 ---- else prm = ¶mLI->params[param->paramid - 1]; ! /* ! * We don't just check OidIsValid, but insist that the ! * fetched type match the Param, just in case the hook did ! * something unexpected. No need to throw an error here ! * though; leave that for runtime. ! */ ! if (OidIsValid(prm->ptype) && ! prm->ptype == param->paramtype) { /* OK to substitute parameter value? */ if (context->estimate || *************** eval_const_expressions_mutator(Node *nod *** 2583,2589 **** bool typByVal; Datum pval; - Assert(prm->ptype == param->paramtype); get_typlenbyval(param->paramtype, &typLen, &typByVal); if (prm->isnull || typByVal) --- 2590,2595 ---- diff --git a/src/pl/plpgsql/src/expected/plpgsql_record.out b/src/pl/plpgsql/src/expected/plpgsql_record.out index 6ea88b3..b5129b3 100644 *** a/src/pl/plpgsql/src/expected/plpgsql_record.out --- b/src/pl/plpgsql/src/expected/plpgsql_record.out *************** alter table mutable drop column f3; *** 459,464 **** --- 459,487 ---- select getf3(null::mutable); -- fails again ERROR: record "x" has no field "f3" \set SHOW_CONTEXT errors + -- check behavior with changes in a record rowtype + create function show_result_type(text) returns text language plpgsql as + $$ + declare + r record; + t text; + begin + execute $1 into r; + select pg_typeof(r.a) into t; + return format('type %s value %s', t, r.a::text); + end; + $$; + select show_result_type('select 1 as a'); + show_result_type + ---------------------- + type integer value 1 + (1 row) + + -- currently this fails, perhaps someday we can make it succeed + select show_result_type('select 2.0 as a'); + ERROR: type of parameter 5 (numeric) does not match that when preparing the plan (integer) + CONTEXT: SQL statement "select pg_typeof(r.a)" + PL/pgSQL function show_result_type(text) line 7 at SQL statement -- check access to system columns in a record variable create function sillytrig() returns trigger language plpgsql as $$begin diff --git a/src/pl/plpgsql/src/sql/plpgsql_record.sql b/src/pl/plpgsql/src/sql/plpgsql_record.sql index aba6887..6f638d6 100644 *** a/src/pl/plpgsql/src/sql/plpgsql_record.sql --- b/src/pl/plpgsql/src/sql/plpgsql_record.sql *************** alter table mutable drop column f3; *** 297,302 **** --- 297,319 ---- select getf3(null::mutable); -- fails again \set SHOW_CONTEXT errors + -- check behavior with changes in a record rowtype + create function show_result_type(text) returns text language plpgsql as + $$ + declare + r record; + t text; + begin + execute $1 into r; + select pg_typeof(r.a) into t; + return format('type %s value %s', t, r.a::text); + end; + $$; + + select show_result_type('select 1 as a'); + -- currently this fails, perhaps someday we can make it succeed + select show_result_type('select 2.0 as a'); + -- check access to system columns in a record variable create function sillytrig() returns trigger language plpgsql as
On Thu, Jul 26, 2018 at 2:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I was about to add Andrew's example as a test case (also shown in > attached), but realized that there's a problem: just as noted in > the similar test for named-composite-type changes a bit above there, > the failure fails to fail in CLOBBER_CACHE_ALWAYS builds. Sorry if this is a dumb question, but why does that happen? I thought that the idea was that CLOBBER_CACHE_ALWAYS shouldn't change semantics. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Jul 26, 2018 at 2:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I was about to add Andrew's example as a test case (also shown in >> attached), but realized that there's a problem: just as noted in >> the similar test for named-composite-type changes a bit above there, >> the failure fails to fail in CLOBBER_CACHE_ALWAYS builds. > Sorry if this is a dumb question, but why does that happen? I thought > that the idea was that CLOBBER_CACHE_ALWAYS shouldn't change > semantics. What we need to fix the problem properly is to forcibly replan a plancache entry when the type of one of the query's input parameters changes. If you turn on CCA, plancache entries are forcibly replanned at every use. So, problem goes away (in the most expensive way possible). regards, tom lane
Hi, On 2018-07-26 14:06:16 -0400, Tom Lane wrote: > I was about to add Andrew's example as a test case (also shown in > attached), but realized that there's a problem: just as noted in > the similar test for named-composite-type changes a bit above there, > the failure fails to fail in CLOBBER_CACHE_ALWAYS builds. We'd decided > to just omit that test (cf feb1cc559) but having been embarrassed by > this crash I'm feeling like we really could do with some test coverage. > I'm tempted to extract the affected test cases into their own small > test script and provide two variant expected files, one for normal and > one for CLOBBER_CACHE_ALWAYS builds. Thoughts? Could we perhaps avoid the divergence by prevent a replan, e.g. by using a cursor and starting the execution of the query before changing the type? Otherwise I think I'm ok with having an alternative file, as long as there's a good comment explaining it. -Andres
Andres Freund <andres@anarazel.de> writes: > On 2018-07-26 14:06:16 -0400, Tom Lane wrote: >> I was about to add Andrew's example as a test case (also shown in >> attached), but realized that there's a problem: just as noted in >> the similar test for named-composite-type changes a bit above there, >> the failure fails to fail in CLOBBER_CACHE_ALWAYS builds. We'd decided >> to just omit that test (cf feb1cc559) but having been embarrassed by >> this crash I'm feeling like we really could do with some test coverage. >> I'm tempted to extract the affected test cases into their own small >> test script and provide two variant expected files, one for normal and >> one for CLOBBER_CACHE_ALWAYS builds. Thoughts? > Could we perhaps avoid the divergence by prevent a replan, e.g. by using > a cursor and starting the execution of the query before changing the > type? Otherwise I think I'm ok with having an alternative file, as long > as there's a good comment explaining it. Well, at the moment the point of having a test would be to ensure that the code fails cleanly (without crashing) after a type change. AFAICS your proposal would amount to not really testing the type-change path at all, so I don't think it helps. I was envisioning a file header comment along the lines of -- Cache-behavior-dependent test cases -- -- These tests logically belong in plpgsql_record.sql, and perhaps someday -- can be moved back there. For now, however, their results are different -- between regular and CLOBBER_CACHE_ALWAYS builds, so we must have two -- expected-output files to cover both cases. To minimize the maintenance -- effort resulting from that, this file should contain only tests that -- do have different results under CLOBBER_CACHE_ALWAYS. regards, tom lane