Thread: Wrong aggregate result when sorting by a NULL value
Dear PostgreSQLers, the following seems as a bug to me on Postgres 11.0: CREATE FUNCTION first_arg(ANYELEMENT, ANYELEMENT) RETURNS ANYELEMENT AS $function$ SELECT $1 $function$ LANGUAGE SQL IMMUTABLE STRICT; CREATE AGGREGATE first(ANYELEMENT) ( SFUNC = first_arg, STYPE = ANYELEMENT ); CREATE TABLE t ( x TEXT, y INT, z DATE ); INSERT INTO t (x, y, z) VALUES ('val', 42, NULL); SELECT first(x ORDER BY y) FROM t; -- returns 'val', as expected SELECT first(x ORDER BY y, z) FROM t; -- returns NULL, which seems wrong I would expect both the SELECT statements to return 'val'. Additional order by "z" should make no difference as there is just one row in the table. More interestingly, if "z" is not NULL, the result is correct: UPDATE t SET z = CURRENT_DATE; SELECT first(x ORDER BY y) FROM t; -- returns 'val' SELECT first(x ORDER BY y, z) FROM t; -- returns 'val' The documentation [https://www.postgresql.org/docs/11/static/xaggr.html] says that if the state function is STRICT, the first non-NULL value is automatically used as the initial state. The ORDER BY option is not documented to have any effect on this - the documentation just says that "[DISTINCT and ORDER BY] options are implemented behind the scenes and are not the concern of the aggregate's support functions." Do I miss something, or is it really a bug? Best regards, Ondrej Bouda
=?UTF-8?Q?Ond=c5=99ej_Bouda?= <obouda@email.cz> writes: > the following seems as a bug to me on Postgres 11.0: Yeah, somebody broke this between 10.x and 11.0. You don't need the custom aggregate, even plain min() fails: regression=# SELECT min(x ORDER BY z) FROM t; min ----- (1 row) while 10.5 delivers the expected result: regression=# SELECT min(x ORDER BY z) FROM t; min ----- val (1 row) I've not looked at the code yet, but it's acting like somebody changed the STRICT logic from "are any of the aggregate's arguments null" to "is any part of the whole row (including ordering values) null". Wrong ... regards, tom lane
I wrote: > =?UTF-8?Q?Ond=c5=99ej_Bouda?= <obouda@email.cz> writes: >> the following seems as a bug to me on Postgres 11.0: > Yeah, somebody broke this between 10.x and 11.0. ... > I've not looked at the code yet, but it's acting like somebody changed the > STRICT logic from "are any of the aggregate's arguments null" to "is any > part of the whole row (including ordering values) null". Wrong ... git bisect fingers this: commit 69c3936a1499b772a749ae629fc59b2d72722332 Author: Andres Freund <andres@anarazel.de> Date: Tue Jan 9 13:25:38 2018 -0800 Expression evaluation based aggregate transition invocation. Andres, do you have time to look at this right now? regards, tom lane
Hi, On 2018-11-02 11:34:46 -0400, Tom Lane wrote: > I wrote: > > =?UTF-8?Q?Ond=c5=99ej_Bouda?= <obouda@email.cz> writes: > >> the following seems as a bug to me on Postgres 11.0: > > > Yeah, somebody broke this between 10.x and 11.0. ... > > I've not looked at the code yet, but it's acting like somebody changed the > > STRICT logic from "are any of the aggregate's arguments null" to "is any > > part of the whole row (including ordering values) null". Wrong ... > > git bisect fingers this: > > commit 69c3936a1499b772a749ae629fc59b2d72722332 > Author: Andres Freund <andres@anarazel.de> > Date: Tue Jan 9 13:25:38 2018 -0800 > > Expression evaluation based aggregate transition invocation. > > Andres, do you have time to look at this right now? Thanks for bisecting. I'll take a look later today. Greetings, Andres Freund
>>>>> "Andres" == Andres Freund <andres@anarazel.de> writes: >> Andres, do you have time to look at this right now? Andres> Thanks for bisecting. I'll take a look later today. Looks like this: + scratch.d.agg_strict_input_check.nargs = numInputs; should have been pertrans->numTransInputs instead? -- Andrew (irc:RhodiumToad)
Hi, On 2018-11-02 17:34:43 +0000, Andrew Gierth wrote: > >>>>> "Andres" == Andres Freund <andres@anarazel.de> writes: > > >> Andres, do you have time to look at this right now? > > Andres> Thanks for bisecting. I'll take a look later today. > > Looks like this: > > + scratch.d.agg_strict_input_check.nargs = numInputs; > > should have been pertrans->numTransInputs instead? That looks like it's precisely the reason. I'll push something tomorrow PST morning, crediting both you (diagnosis) and Tom (bisecting) obviously, unless you prefer to do so yourself. It's a bit sad that we don't have any tests that test this :/. I'll add something for the specific case, but that'll obviously not be exhaustive. Greetings, Andres Freund
Hi, On 2018-11-02 22:57:00 -0700, Andres Freund wrote: > On 2018-11-02 17:34:43 +0000, Andrew Gierth wrote: > > >>>>> "Andres" == Andres Freund <andres@anarazel.de> writes: > > > > >> Andres, do you have time to look at this right now? > > > > Andres> Thanks for bisecting. I'll take a look later today. > > > > Looks like this: > > > > + scratch.d.agg_strict_input_check.nargs = numInputs; > > > > should have been pertrans->numTransInputs instead? > > That looks like it's precisely the reason. I'll push something tomorrow > PST morning, crediting both you (diagnosis) and Tom (bisecting) > obviously, unless you prefer to do so yourself. > > It's a bit sad that we don't have any tests that test this :/. I'll add > something for the specific case, but that'll obviously not be > exhaustive. And pushed. Thanks Ondřej for the report, thanks Tom & Andrew for identifying the issue. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > And pushed. Thanks Ondřej for the report, thanks Tom & Andrew for > identifying the issue. Hm, buildfarm seems less than pleased. Did you miss making a corresponding change in the JIT code? regards, tom lane
On 2018-11-03 18:18:40 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > And pushed. Thanks Ondřej for the report, thanks Tom & Andrew for > > identifying the issue. > > Hm, buildfarm seems less than pleased. Did you miss making a > corresponding change in the JIT code? Hm, I'm somewhat confused, let me look into that. The JIT code shouldn't really need to be changed here - it's the *generation* of expression steps that's going wrong - which then later get turned into JITed code, but that part worked previously for other expressions. Greetings, Andres Freund
Hi, On 2018-11-03 15:27:09 -0700, Andres Freund wrote: > On 2018-11-03 18:18:40 -0400, Tom Lane wrote: > > Andres Freund <andres@anarazel.de> writes: > > > And pushed. Thanks Ondřej for the report, thanks Tom & Andrew for > > > identifying the issue. > > > > Hm, buildfarm seems less than pleased. Did you miss making a > > corresponding change in the JIT code? > > Hm, I'm somewhat confused, let me look into that. The JIT code shouldn't > really need to be changed here - it's the *generation* of expression > steps that's going wrong - which then later get turned into JITed code, > but that part worked previously for other expressions. Turns out it's not a great idea to generate EEOP_AGG_STRICT_INPUT_CHECK expressions with nargs = 0. Head -> Desk. Pushed a fix (+ new assertion). Thanks! Andres Freund
On Sat, Nov 3, 2018 at 7:08 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hm, I'm somewhat confused, let me look into that. The JIT code shouldn't
> really need to be changed here - it's the *generation* of expression
> steps that's going wrong - which then later get turned into JITed code,
> but that part worked previously for other expressions.
Turns out it's not a great idea to generate EEOP_AGG_STRICT_INPUT_CHECK
expressions with nargs = 0. Head -> Desk. Pushed a fix (+ new
assertion).
I'm now getting a compiler warning:
execExpr.c: In function 'ExecBuildAggTrans':
execExpr.c:2864:7: warning: unused variable 'numInputs' [-Wunused-variable]
int numInputs = pertrans->numInputs;
Cheers,
Jeff
Jeff Janes <jeff.janes@gmail.com> writes: > I'm now getting a compiler warning: > execExpr.c: In function 'ExecBuildAggTrans': > execExpr.c:2864:7: warning: unused variable 'numInputs' [-Wunused-variable] > int numInputs = pertrans->numInputs; Used-for-asserts-only problem. Will fix. regards, tom lane
On 2018-11-04 11:19:59 -0500, Tom Lane wrote: > Jeff Janes <jeff.janes@gmail.com> writes: > > I'm now getting a compiler warning: > > > execExpr.c: In function 'ExecBuildAggTrans': > > execExpr.c:2864:7: warning: unused variable 'numInputs' [-Wunused-variable] > > int numInputs = pertrans->numInputs; > > Used-for-asserts-only problem. Will fix. Thanks. I wonder if we shouldn't turn Asserts() into something roughly akin to if (0) {expr};. That way we'd not deal with errors about unused variables anymore - we're not safe against unreachable code warnings anyway. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2018-11-04 11:19:59 -0500, Tom Lane wrote: >> Used-for-asserts-only problem. Will fix. > Thanks. I wonder if we shouldn't turn Asserts() into something roughly > akin to if (0) {expr};. That way we'd not deal with errors about unused > variables anymore - we're not safe against unreachable code warnings > anyway. Meh. I'm unexcited about getting rid of one type of compiler warning by introducing another one. regards, tom lane
On 2018-11-04 18:49:40 -0500, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2018-11-04 11:19:59 -0500, Tom Lane wrote: > >> Used-for-asserts-only problem. Will fix. > > > Thanks. I wonder if we shouldn't turn Asserts() into something roughly > > akin to if (0) {expr};. That way we'd not deal with errors about unused > > variables anymore - we're not safe against unreachable code warnings > > anyway. > > Meh. I'm unexcited about getting rid of one type of compiler warning by > introducing another one. It's impracticable to enable dead code warnings in postgres anyway. There's way way too many of them (all the returns after elog(ERROR) etc are enough to make it infeasible to change that). Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2018-11-04 18:49:40 -0500, Tom Lane wrote: >> Meh. I'm unexcited about getting rid of one type of compiler warning by >> introducing another one. > It's impracticable to enable dead code warnings in postgres > anyway. There's way way too many of them (all the returns after > elog(ERROR) etc are enough to make it infeasible to change that). I think you have a very limited conception of what sorts of warnings code like that would trigger. In particular, even given the unsubstantiated assumption that nobody's compiler would complain, I think static analyzers like Coverity might. regards, tom lane