Re: Statistical aggregate functions are not working with PARTIALaggregation - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Statistical aggregate functions are not working with PARTIALaggregation
Date
Msg-id 20190519183624.tkuph4oj5atd6ee3@alap3.anarazel.de
Whole thread Raw
In response to Re: Statistical aggregate functions are not working with PARTIAL aggregation  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: Statistical aggregate functions are not working with PARTIAL aggregation
Re: Statistical aggregate functions are not working with PARTIAL aggregation
List pgsql-hackers
Hi,

On 2019-05-19 20:18:38 +1200, David Rowley wrote:
> On Fri, 17 May 2019 at 15:04, Andres Freund <andres@anarazel.de> wrote:
> >
> > On 2019-05-08 13:06:36 +0900, Kyotaro HORIGUCHI wrote:
> > > In a second look, I seems to me that the right thing to do here
> > > is setting numInputs instaed of numArguments to numTransInputs in
> > > combining step.
> >
> > Yea, to me this just seems a consequence of the wrong
> > numTransInputs. Arguably this is a bug going back to 9.6, where
> > combining aggregates where introduced. It's just that numTransInputs
> > isn't used anywhere for combining aggregates, before 11.
> 
> Isn't it more due to the lack of any aggregates with > 1 arg having a
> combine function?

I'm not sure I follow? regr_count() already was in 9.6? Including a
combine function?

postgres[1490][1]=# SELECT version();
┌──────────────────────────────────────────────────────────────────────────────────────────┐
│                                         version                                          │
├──────────────────────────────────────────────────────────────────────────────────────────┤
│ PostgreSQL 9.6.13 on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-7) 8.3.0, 64-bit │
└──────────────────────────────────────────────────────────────────────────────────────────┘
(1 row)

postgres[1490][1]=# SELECT aggfnoid::regprocedure FROM pg_aggregate pa JOIN pg_proc pptrans ON (pa.aggtransfn =
pptrans.oid)AND pptrans.pronargs > 2 AND aggcombinefn <> 0;
 
┌───────────────────────────────────────────────────┐
│                     aggfnoid                      │
├───────────────────────────────────────────────────┤
│ regr_count(double precision,double precision)     │
│ regr_sxx(double precision,double precision)       │
│ regr_syy(double precision,double precision)       │
│ regr_sxy(double precision,double precision)       │
│ regr_avgx(double precision,double precision)      │
│ regr_avgy(double precision,double precision)      │
│ regr_r2(double precision,double precision)        │
│ regr_slope(double precision,double precision)     │
│ regr_intercept(double precision,double precision) │
│ covar_pop(double precision,double precision)      │
│ covar_samp(double precision,double precision)     │
│ corr(double precision,double precision)           │
└───────────────────────────────────────────────────┘


But it's not an active problem in 9.6, because numTransInputs wasn't
used at all for combine functions: Before c253b722f6 there simply was no
NULL check for strict trans functions, and after that the check was
simply hardcoded for the right offset in fcinfo, as it's done by code
specific to aggsplit combine.

In bf6c614a2f2 that was generalized, so the strictness check was done by
common code doing the strictness checks, based on numTransInputs. But
due to the fact that the relevant fcinfo->isnull[2..] was always
zero-initialized (more or less by accident, by being part of the
AggStatePerTrans struct, which is palloc0'ed), there was no observable
damage, we just checked too many array elements. And then finally in
a9c35cf85ca1f, that broke, because the fcinfo is a) dynamically
allocated without being zeroed b) exactly the right length.


> > While I agree that fixing numTransInputs is the right way, I'm not
> > convinced the way you did it is the right approach. I'm somewhat
> > inclined to think that it's wrong that ExecInitAgg() calls
> > build_pertrans_for_aggref() with a numArguments that's not actually
> > right? Alternatively I think we should just move the numTransInputs
> > computation into the existing branch around DO_AGGSPLIT_COMBINE.
> 
> Yeah, probably we should be passing in the correct arg count for the
> combinefn to build_pertrans_for_aggref(). However, I see that we also
> pass in the inputTypes from the transfn, just we don't use them when
> working with the combinefn.

Not sure what you mean by that "however"?


> You'll notice that I've just hardcoded the numTransArgs to set it to 1
> when we're working with a combinefn.  The combinefn always requires 2
> args of trans type, so this seems pretty valid to me.

> I think Kyotaro's patch setting of numInputs is wrong.

Yea, my proposal was to simply harcode it to 2 in the
DO_AGGSPLIT_COMBINE path.


> Patch attached of how I think we should fix it.

Thanks.



> diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
> index d01fc4f52e..b061162961 100644
> --- a/src/backend/executor/nodeAgg.c
> +++ b/src/backend/executor/nodeAgg.c
> @@ -2522,8 +2522,9 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
>          int            existing_aggno;
>          int            existing_transno;
>          List       *same_input_transnos;
> -        Oid            inputTypes[FUNC_MAX_ARGS];
> +        Oid            transFnInputTypes[FUNC_MAX_ARGS];
>          int            numArguments;
> +        int            numTransFnArgs;
>          int            numDirectArgs;
>          HeapTuple    aggTuple;
>          Form_pg_aggregate aggform;
> @@ -2701,14 +2702,23 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
>           * could be different from the agg's declared input types, when the
>           * agg accepts ANY or a polymorphic type.
>           */
> -        numArguments = get_aggregate_argtypes(aggref, inputTypes);
> +        numTransFnArgs = get_aggregate_argtypes(aggref, transFnInputTypes);

Not sure I understand the distinction you're trying to make with the
variable renaming. The combine function is also a transition function,
no?


>          /* Count the "direct" arguments, if any */
>          numDirectArgs = list_length(aggref->aggdirectargs);
>  
> +        /*
> +         * Combine functions always have a 2 trans state type input params, so
> +         * this is always set to 1 (we don't count the first trans state).
> +         */

Perhaps the parenthetical should instead be something like "to 1 (the
trans type is not counted as an arg, just like with non-combine trans
function)" or similar?


> @@ -2781,7 +2791,7 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
>                                        aggref, transfn_oid, aggtranstype,
>                                        serialfn_oid, deserialfn_oid,
>                                        initValue, initValueIsNull,
> -                                      inputTypes, numArguments);
> +                                      transFnInputTypes, numArguments);

That means we pass in the wrong input types? Seems like it'd be better
to either pass an empty list, or just create the argument list here.


I'm inclined to push a minimal fix now, and then a slightly more evolved
version fo this after beta1.


> diff --git a/src/test/regress/sql/aggregates.sql b/src/test/regress/sql/aggregates.sql
> index d4fd657188..bd8b9e8b4f 100644
> --- a/src/test/regress/sql/aggregates.sql
> +++ b/src/test/regress/sql/aggregates.sql
> @@ -963,10 +963,11 @@ SET enable_indexonlyscan = off;
>  
>  -- variance(int4) covers numeric_poly_combine
>  -- sum(int8) covers int8_avg_combine
> +-- regr_cocunt(float8, float8) covers int8inc_float8_float8 and aggregates with > 1 arg

typo...

>  EXPLAIN (COSTS OFF)
> -  SELECT variance(unique1::int4), sum(unique1::int8) FROM tenk1;
> +  SELECT variance(unique1::int4), sum(unique1::int8),regr_count(unique1::float8, unique1::float8) FROM tenk1;
>  
> -SELECT variance(unique1::int4), sum(unique1::int8) FROM tenk1;
> +SELECT variance(unique1::int4), sum(unique1::int8),regr_count(unique1::float8, unique1::float8) FROM tenk1;
>  
>  ROLLBACK;

Does this actually cover the bug at issue here? The non-combine case
wasn't broken?

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Remove useless associativity/precedence from parsers
Next
From: Tom Lane
Date:
Subject: Re: Emacs vs pg_indent's weird indentation for function declarations