Thread: BUG #16368: Incorrect function inlining in the presence of a window function
BUG #16368: Incorrect function inlining in the presence of a window function
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 16368 Logged by: Elvis Pranskevichus Email address: elprans@gmail.com PostgreSQL version: 12.2 Operating system: Gentoo Linux Description: Consider the following function: CREATE OR REPLACE FUNCTION intfmt(input text, fmt text) RETURNS bigint LANGUAGE sql AS $$ SELECT CASE WHEN fmt IS NULL THEN input::bigint ELSE to_number(input, fmt)::bigint END; $$; And the following query: SELECT intfmt('123,456', q.fmt) AS "out" FROM (SELECT ("v" = first_value("v") OVER ()), '999,999' AS "fmt" FROM (SELECT 1 AS "v") AS "q2" ) AS "q"; The expected result is the integer 123456, but the query fails with: ERROR: invalid input syntax for type bigint: "123,456" CONTEXT: SQL function "intfmt" during inlining Which means that somehow during inlining of "intfmt" Postgres incorrectly takes the first branch in the `CASE` expression. This only happens in the presence of the "first_value" window call in the nested query. Elvis
Re: BUG #16368: Incorrect function inlining in the presence of awindow function
From
"David G. Johnston"
Date:
On Wed, Apr 15, 2020 at 11:07 AM PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:
Bug reference: 16368
Logged by: Elvis Pranskevichus
Email address: elprans@gmail.com
PostgreSQL version: 12.2
Operating system: Gentoo Linux
Description:
Consider the following function:
CREATE OR REPLACE FUNCTION intfmt(input text, fmt text)
[...]
SELECT
CASE WHEN fmt IS NULL
THEN input::bigint
ELSE to_number(input, fmt)::bigint
END;
[...]
SELECT
[...]
intfmt('123,456', q.fmt) AS "out"
The expected result is the integer 123456, but the query fails with:
ERROR: invalid input syntax for type bigint: "123,456"
CONTEXT: SQL function "intfmt" during inlining
Which means that somehow during inlining of "intfmt" Postgres incorrectly
takes the first branch in the `CASE` expression.
During inlining the case expression becomes:
CASE WHEN q.fmt IS NULL
THEN '123,456'::bigint
ELSE to_number('123,456', q.fmt)
END;
It doesn't "take" a branch - it turns variables into constants and, as written, some of those constants are invalid for the types they are being assigned to.
This only happens in the
presence of the "first_value" window call in the nested query.
The ability to optimize, and how, depends on the whole query.
I don't actually know whether this is a bug or just an expected downside to using inline-able functions and case statements to avoid malformed data parsing.
Writing the function in pl/pgsql prevents the inlining and stabilizes the query.
David J.
Re: BUG #16368: Incorrect function inlining in the presence of a window function
From
Elvis Pranskevichus
Date:
On Wednesday, April 15, 2020 11:32:54 A.M. PDT David G. Johnston wrote: > During inlining the case expression becomes: > > CASE WHEN q.fmt IS NULL > THEN '123,456'::bigint > ELSE to_number('123,456', q.fmt) > END; > > It doesn't "take" a branch - it turns variables into constants and, as > written, some of those constants are invalid for the types they are > being assigned to. > > > This only happens in the > > presence of the "first_value" window call in the nested query. > > The ability to optimize, and how, depends on the whole query. > > I don't actually know whether this is a bug or just an expected > downside to using inline-able functions and case statements to avoid > malformed data parsing. > > Writing the function in pl/pgsql prevents the inlining and stabilizes > the query. IMO, an optimization that leads to wrong query results is unquestionably a bug. And "use pl/pgsql" (which is much slower) is arguably not the best answer here. > inline-able functions and case statements to avoid > malformed data parsing. Consider any other case where an error is guarded by a "CASE WHEN", such as division by zero. I think the use of "CASE WHEN" should disqualify the function from being inlined, or, maybe, constant folding should be disabled in the branches of "CASE WHEN" when inlining and when the optimizer is unable to reason about the "CASE" condition. Thoughts? Elvis
Re: BUG #16368: Incorrect function inlining in the presence of awindow function
From
"David G. Johnston"
Date:
On Wed, Apr 15, 2020 at 1:08 PM Elvis Pranskevichus <elprans@gmail.com> wrote:
On Wednesday, April 15, 2020 11:32:54 A.M. PDT David G. Johnston wrote:
> During inlining the case expression becomes:
>
> CASE WHEN q.fmt IS NULL
> THEN '123,456'::bigint
> ELSE to_number('123,456', q.fmt)
> END;
>
> It doesn't "take" a branch - it turns variables into constants and, as
> written, some of those constants are invalid for the types they are
> being assigned to.
>
> > This only happens in the
> > presence of the "first_value" window call in the nested query.
>
> The ability to optimize, and how, depends on the whole query.
>
> I don't actually know whether this is a bug or just an expected
> downside to using inline-able functions and case statements to avoid
> malformed data parsing.
>
> Writing the function in pl/pgsql prevents the inlining and stabilizes
> the query.
IMO, an optimization that leads to wrong query results is unquestionably
a bug. And "use pl/pgsql" (which is much slower) is arguably not the
best answer here.
Maybe not, but if you want something that works it is a solution.
> inline-able functions and case statements to avoid
> malformed data parsing.
Consider any other case where an error is guarded by a "CASE WHEN", such
as division by zero.
There aren't all that many and throwing out a perfectly good optimization for boundary cases that will error, as opposed to returning but including invalid results, isn't that desirable either.
I think the use of "CASE WHEN" should disqualify
the function from being inlined, or, maybe, constant folding should be
disabled in the branches of "CASE WHEN" when inlining and when the
optimizer is unable to reason about the "CASE" condition.
Thoughts?
Outside my wheelhouse as to what is practical. I suspect we'll get a better answer on that front in due time.
The world isn't perfect and I sure cannot write a patch to change this behavior and for my money its something I could live with if it was decided that this is the best option at this time.
David J.
Re: BUG #16368: Incorrect function inlining in the presence of a window function
From
Tom Lane
Date:
"David G. Johnston" <david.g.johnston@gmail.com> writes: > On Wed, Apr 15, 2020 at 1:08 PM Elvis Pranskevichus <elprans@gmail.com> > wrote: >> Consider any other case where an error is guarded by a "CASE WHEN", such >> as division by zero. > There aren't all that many and throwing out a perfectly good optimization > for boundary cases that will error, as opposed to returning but including > invalid results, isn't that desirable either. In point of fact, there are many ways in which CASE and related constructs fail to guarantee evaluation order, as noted in https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-EXPRESS-EVAL The particular case mentioned there seems to be about the same as here: constant-folding happens even in CASE arms that will never be reached at runtime. Problems could happen earlier than that, too. It's not hard to imagine cases that wouldn't execute "as desired" unless we didn't even do parse analysis (e.g, subexpression type determination) on a CASE arm until it's reached at runtime. But we're not going to make that sort of thing happen; it's just too much contortion and inefficiency for too little benefit. In particular, people for whom the current implementation works OK would be quite unhappy with the performance impact of de-optimizing CASE that way. regards, tom lane
Re: BUG #16368: Incorrect function inlining in the presence of a window function
From
Elvis Pranskevichus
Date:
On Wednesday, April 15, 2020 5:14:05 P.M. PDT Tom Lane wrote: > In point of fact, there are many ways in which CASE and related > constructs fail to guarantee evaluation order, as noted in > https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-EX > PRESS-EVAL > > The particular case mentioned there seems to be about the same as > here: constant-folding happens even in CASE arms that will never be > reached at runtime. Yes, but function arguments aren't constants are they? At least the documentation makes no effort to mention that. > would be quite unhappy with the performance impact of de-optimizing > CASE that way. I'm not arguing for the general de-optimization for CASE, just for not treating arguments of inlined functions as constants in the CASE statement. For arguments of a prepared statement this optimization makes even less sense. Elvis
Re: BUG #16368: Incorrect function inlining in the presence of a window function
From
Elvis Pranskevichus
Date:
> The particular case mentioned there seems to be about the same as > here: constant-folding happens even in CASE arms that will never be > reached at runtime. Since the actual issue here is false positive exceptions, why can't we treat errors raised during const eval as a signal to cancel the constant folding, i.e. ignore them and keep the failing expression unoptimized? Elvis
Re: BUG #16368: Incorrect function inlining in the presence of awindow function
From
Kyotaro Horiguchi
Date:
At Wed, 15 Apr 2020 18:17:30 -0700, Elvis Pranskevichus <elprans@gmail.com> wrote in > On Wednesday, April 15, 2020 5:14:05 P.M. PDT Tom Lane wrote: > > > In point of fact, there are many ways in which CASE and related > > constructs fail to guarantee evaluation order, as noted in > > https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-EX > > PRESS-EVAL > > > > The particular case mentioned there seems to be about the same as > > here: constant-folding happens even in CASE arms that will never be > > reached at runtime. > > Yes, but function arguments aren't constants are they? At least the > documentation makes no effort to mention that. > > > would be quite unhappy with the performance impact of de-optimizing > > CASE that way. > > I'm not arguing for the general de-optimization for CASE, just for not > treating arguments of inlined functions as constants in the CASE > statement. For arguments of a prepared statement this optimization > makes even less sense. Perhaps there's seems to be no nice way to detect the case where an arm in a case expression cannot even be executed although the expression is seemingly constant-foldable. (Note that all arms of the case has a chance to be reached since the case-expression is NOT a constant at the parse time.) Isn't there any workaround usable for you? I'm not sure what is your real issue, but it seems to me unnatural that the first argument of intfmt is a literal string. For example, the following query works. SELECT intfmt(q.data, q.fmt) AS "out" FROM (SELECT ("v" = first_value("v") OVER ()), '123,456' AS data, '999,999' AS fmt FROM (SELECT 1 AS "v") AS "q2" ) AS "q"; regards. -- Kyotaro Horiguchi NTT Open Source Software Center