Thread: Fixing bug #8228 ("set-valued function called in context that cannot accept a set")
Fixing bug #8228 ("set-valued function called in context that cannot accept a set")
From
Tom Lane
Date:
I kinda forgot about this bug when I went off on vacation: http://www.postgresql.org/message-id/E1UnCv4-0007oF-Bo@wrigleys.postgresql.org The way to fix it seems to be to do static prechecking of whether a function's input arguments can return sets, rather than relying on their behavior during the first execution. We already have the infrastructure needed, namely expression_returns_set(), so a fix can be pretty short, as illustrated in the attached proposed patch. Now one might object that this will add an unwanted amount of overhead to query startup. expression_returns_set() is fairly cheap, since it only requires a parsetree walk and not any catalog lookups, but it's not zero cost. On the other hand, some of that cost is bought back in normal non-set cases by the fact that we needn't go through ExecMakeFunctionResult() even once. I tried to measure whether there was a slowdown using this test case: $ pgbench -c 4 -T 60 -S -n bench in a non-assert build using a scale-factor-10 pgbench database, on an 8-core machine. The best reported transaction rate over five trials was 35556.680918 in current HEAD, and 35466.438468 with the patch, for an apparent slowdown of 0.3%. This is below what I'd normally consider significant, since the run-to-run variability is considerably more than that, but there may be some actual slowdown there. We could consider a more invasive fix that would try to push the static checking back into the planner or even parser, but that would not make things any faster when queries are only executed once after planning, as is true in this test scenario. In any case, adding fields to FuncExpr/OpExpr would not be a back-patchable fix. Thoughts? Unless someone has a better idea, I'm inclined to push this patch and call it good. regards, tom lane diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c index 67dca78..6f45973 100644 *** a/src/backend/executor/execQual.c --- b/src/backend/executor/execQual.c *************** tupledesc_match(TupleDesc dst_tupdesc, T *** 1634,1642 **** * init_fcache is presumed already run on the FuncExprState. * * This function handles the most general case, wherein the function or ! * one of its arguments might (or might not) return a set. If we find ! * no sets involved, we will change the FuncExprState's function pointer ! * to use a simpler method on subsequent calls. */ static Datum ExecMakeFunctionResult(FuncExprState *fcache, --- 1634,1640 ---- * init_fcache is presumed already run on the FuncExprState. * * This function handles the most general case, wherein the function or ! * one of its arguments can return a set. */ static Datum ExecMakeFunctionResult(FuncExprState *fcache, *************** restart: *** 1906,1918 **** /* * Non-set case: much easier. * ! * We change the ExprState function pointer to use the simpler ! * ExecMakeFunctionResultNoSets on subsequent calls. This amounts to ! * assuming that no argument can return a set if it didn't do so the ! * first time. */ - fcache->xprstate.evalfunc = (ExprStateEvalFunc) ExecMakeFunctionResultNoSets; - if (isDone) *isDone = ExprSingleResult; --- 1904,1915 ---- /* * Non-set case: much easier. * ! * In common cases, this code path is unreachable because we'd have ! * selected ExecMakeFunctionResultNoSets instead. However, it's ! * possible to get here if an argument sometimes produces set results ! * and sometimes scalar results. For example, a CASE expression might ! * call a set-returning function in only some of its arms. */ if (isDone) *isDone = ExprSingleResult; *************** ExecEvalFunc(FuncExprState *fcache, *** 2371,2380 **** init_fcache(func->funcid, func->inputcollid, fcache, econtext->ecxt_per_query_memory, true); ! /* Go directly to ExecMakeFunctionResult on subsequent uses */ ! fcache->xprstate.evalfunc = (ExprStateEvalFunc) ExecMakeFunctionResult; ! ! return ExecMakeFunctionResult(fcache, econtext, isNull, isDone); } /* ---------------------------------------------------------------- --- 2368,2389 ---- init_fcache(func->funcid, func->inputcollid, fcache, econtext->ecxt_per_query_memory, true); ! /* ! * We need to invoke ExecMakeFunctionResult if either the function itself ! * or any of its input expressions can return a set. Otherwise, invoke ! * ExecMakeFunctionResultNoSets. In either case, change the evalfunc ! * pointer to go directly there on subsequent uses. ! */ ! if (fcache->func.fn_retset || expression_returns_set((Node *) func->args)) ! { ! fcache->xprstate.evalfunc = (ExprStateEvalFunc) ExecMakeFunctionResult; ! return ExecMakeFunctionResult(fcache, econtext, isNull, isDone); ! } ! else ! { ! fcache->xprstate.evalfunc = (ExprStateEvalFunc) ExecMakeFunctionResultNoSets; ! return ExecMakeFunctionResultNoSets(fcache, econtext, isNull, isDone); ! } } /* ---------------------------------------------------------------- *************** ExecEvalOper(FuncExprState *fcache, *** 2394,2403 **** init_fcache(op->opfuncid, op->inputcollid, fcache, econtext->ecxt_per_query_memory, true); ! /* Go directly to ExecMakeFunctionResult on subsequent uses */ ! fcache->xprstate.evalfunc = (ExprStateEvalFunc) ExecMakeFunctionResult; ! ! return ExecMakeFunctionResult(fcache, econtext, isNull, isDone); } /* ---------------------------------------------------------------- --- 2403,2424 ---- init_fcache(op->opfuncid, op->inputcollid, fcache, econtext->ecxt_per_query_memory, true); ! /* ! * We need to invoke ExecMakeFunctionResult if either the function itself ! * or any of its input expressions can return a set. Otherwise, invoke ! * ExecMakeFunctionResultNoSets. In either case, change the evalfunc ! * pointer to go directly there on subsequent uses. ! */ ! if (fcache->func.fn_retset || expression_returns_set((Node *) op->args)) ! { ! fcache->xprstate.evalfunc = (ExprStateEvalFunc) ExecMakeFunctionResult; ! return ExecMakeFunctionResult(fcache, econtext, isNull, isDone); ! } ! else ! { ! fcache->xprstate.evalfunc = (ExprStateEvalFunc) ExecMakeFunctionResultNoSets; ! return ExecMakeFunctionResultNoSets(fcache, econtext, isNull, isDone); ! } } /* ---------------------------------------------------------------- diff --git a/src/test/regress/expected/rangefuncs.out b/src/test/regress/expected/rangefuncs.out index a988dd0..9d40510 100644 *** a/src/test/regress/expected/rangefuncs.out --- b/src/test/regress/expected/rangefuncs.out *************** select * from foobar(); -- fail *** 1992,1994 **** --- 1992,2008 ---- ERROR: function return row and query-specified return row do not match DETAIL: Returned row contains 3 attributes, but query expects 2. drop function foobar(); + -- check behavior when a function's input sometimes returns a set (bug #8228) + SELECT *, + lower(CASE WHEN id = 2 THEN (regexp_matches(str, '^0*([1-9]\d+)$'))[1] + ELSE str + END) + FROM + (VALUES (1,''), (2,'0000000049404'), (3,'FROM 10000000876')) v(id, str); + id | str | lower + ----+------------------+------------------ + 1 | | + 2 | 0000000049404 | 49404 + 3 | FROM 10000000876 | from 10000000876 + (3 rows) + diff --git a/src/test/regress/sql/rangefuncs.sql b/src/test/regress/sql/rangefuncs.sql index ac2769f..07d2e1a 100644 *** a/src/test/regress/sql/rangefuncs.sql --- b/src/test/regress/sql/rangefuncs.sql *************** $$ select (1, 2.1, 3) $$ language sql; *** 599,601 **** --- 599,610 ---- select * from foobar(); -- fail drop function foobar(); + + -- check behavior when a function's input sometimes returns a set (bug #8228) + + SELECT *, + lower(CASE WHEN id = 2 THEN (regexp_matches(str, '^0*([1-9]\d+)$'))[1] + ELSE str + END) + FROM + (VALUES (1,''), (2,'0000000049404'), (3,'FROM 10000000876')) v(id, str);
Re: Fixing bug #8228 ("set-valued function called in context that cannot accept a set")
From
David Johnston
Date:
Tom Lane-2 wrote > I kinda forgot about this bug when I went off on vacation: > http://www.postgresql.org/message-id/ > E1UnCv4-0007oF-Bo@.postgresql Just to clarify: This patch will cause both executions of the example query to fail with the "set-valued function..." error. Also, the reason the "::varchar" one did not fail was because not cast function was ever called but the "::varchar(30)" forced a function call and thus prompted the error when the second record and resultant regexp_matches expression was encountered. Thanks! David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Fixing-bug-8228-set-valued-function-called-in-context-that-cannot-accept-a-set-tp5785622p5785629.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
Re: Fixing bug #8228 ("set-valued function called in context that cannot accept a set")
From
David Johnston
Date:
David Johnston wrote > > Tom Lane-2 wrote >> I kinda forgot about this bug when I went off on vacation: >> http://www.postgresql.org/message-id/ >> E1UnCv4-0007oF-Bo@.postgresql > > Just to clarify: > > This patch will cause both executions of the example query to fail with > the "set-valued function..." error. > > Also, the reason the "::varchar" one did not fail was because not cast > function was ever called but the "::varchar(30)" forced a function call > and thus prompted the error when the second record and resultant > regexp_matches expression was encountered. > > Thanks! > > David J. Sorry for the imprecise English. I believe both of the following items but would like confirmation/clarification of my understanding. The whole "varchar/varchar(30)" discrepancy is bothersome and since the example forces a function-call via the use of "lower(...)", and doesn't test the non-function situation, I am concerned this patch is incorrect. If the first item is not true, i.e. this patch makes both alternatives work, then I think the wrong "solution" was chosen - or at least not fully vetted. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Fixing-bug-8228-set-valued-function-called-in-context-that-cannot-accept-a-set-tp5785622p5785631.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
Re: Re: Fixing bug #8228 ("set-valued function called in context that cannot accept a set")
From
Tom Lane
Date:
David Johnston <polobo@yahoo.com> writes: > The whole "varchar/varchar(30)" discrepancy is bothersome and since the > example forces a function-call via the use of "lower(...)", and doesn't test > the non-function situation, I am concerned this patch is incorrect. The reason casting to varchar(30) fails is that that results in invocation of a function (to enforce the length limit). Casting to varchar is just a RelabelType operation, which doesn't have two different code paths for set and not-set inputs. I did check the patch against your original example, but I thought using lower() made the purpose of the regression test case more apparent. regards, tom lane
Re: Fixing bug #8228 ("set-valued function called in context that cannot accept a set")
From
David Johnston
Date:
Tom Lane-2 wrote > David Johnston < > polobo@ > > writes: >> The whole "varchar/varchar(30)" discrepancy is bothersome and since the >> example forces a function-call via the use of "lower(...)", and doesn't >> test >> the non-function situation, I am concerned this patch is incorrect. > > The reason casting to varchar(30) fails is that that results in invocation > of a function (to enforce the length limit). Casting to varchar is just a > RelabelType operation, which doesn't have two different code paths for > set and not-set inputs. > > I did check the patch against your original example, but I thought using > lower() made the purpose of the regression test case more apparent. Yeah, I caught that part. My focus was on the non-function version. Not being able to apply the patch and test myself it sounds like you likely made the function-invocation version succeed along with the original re-label-only version. I guess backward-compatibility concerns forces this solution but after thinking through what was happening I was leaning more toward making both queries fail. An SRF in a CASE expression seems like a foot-gun to me. SRF in the select-list is someone of a foot-gun too but understandable given the recency of the addition of LATERAL to our toolbox. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Fixing-bug-8228-set-valued-function-called-in-context-that-cannot-accept-a-set-tp5785622p5785636.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
Re: Re: Fixing bug #8228 ("set-valued function called in context that cannot accept a set")
From
Tom Lane
Date:
David Johnston <polobo@yahoo.com> writes: > I guess backward-compatibility concerns forces this solution but after > thinking through what was happening I was leaning more toward making both > queries fail. An SRF in a CASE expression seems like a foot-gun to me. Not really. The cause of the problem in your example is not that the CASE contains a SRF, but that one arm of the CASE returns a set and the other does not. There is maybe some argument for making that situation fail --- essentially, making SETOF be a core part of expression type identification, such that you can't unify setof text with plain text. But given that that works in most cases, and was only broken by a performance optimization added to FuncExpr evaluation (by me :-(), I think removing the unwarranted assumption from that optimization is the right thing. I'd definitely not wish to break cases that work as expected today, but being stricter about expression types would do that. regards, tom lane