Re: Statistical aggregate functions are not working with PARTIAL aggregation - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Re: Statistical aggregate functions are not working with PARTIAL aggregation |
Date | |
Msg-id | CAKJS1f9+d88Hp8nu=UQhb7wjZek+kmT9H4FGcGQnGmrV=JP8xg@mail.gmail.com Whole thread Raw |
In response to | Re: Statistical aggregate functions are not working with PARTIALaggregation (Andres Freund <andres@anarazel.de>) |
Responses |
Re: Statistical aggregate functions are not working with PARTIALaggregation
|
List | pgsql-hackers |
On Mon, 20 May 2019 at 06:36, Andres Freund <andres@anarazel.de> wrote: > > Isn't it more due to the lack of any aggregates with > 1 arg having a > > combine function? > > I'm not sure I follow? regr_count() already was in 9.6? Including a > combine function? Oops, that line I meant to delete before sending. > > Yeah, probably we should be passing in the correct arg count for the > > combinefn to build_pertrans_for_aggref(). However, I see that we also > > pass in the inputTypes from the transfn, just we don't use them when > > working with the combinefn. > > Not sure what you mean by that "however"? Well, previously those two arguments were always for the function in pg_aggregate.aggtransfn. I only changed one of them to mean the trans func that's being used, which detracted slightly from my ambition to change just what numArguments means. > > You'll notice that I've just hardcoded the numTransArgs to set it to 1 > > when we're working with a combinefn. The combinefn always requires 2 > > args of trans type, so this seems pretty valid to me. > > > I think Kyotaro's patch setting of numInputs is wrong. > > Yea, my proposal was to simply harcode it to 2 in the > DO_AGGSPLIT_COMBINE path. ok. > > diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c > > index d01fc4f52e..b061162961 100644 > > --- a/src/backend/executor/nodeAgg.c > > +++ b/src/backend/executor/nodeAgg.c > > @@ -2522,8 +2522,9 @@ ExecInitAgg(Agg *node, EState *estate, int eflags) > > int existing_aggno; > > int existing_transno; > > List *same_input_transnos; > > - Oid inputTypes[FUNC_MAX_ARGS]; > > + Oid transFnInputTypes[FUNC_MAX_ARGS]; > > int numArguments; > > + int numTransFnArgs; > > int numDirectArgs; > > HeapTuple aggTuple; > > Form_pg_aggregate aggform; > > @@ -2701,14 +2702,23 @@ ExecInitAgg(Agg *node, EState *estate, int eflags) > > * could be different from the agg's declared input types, when the > > * agg accepts ANY or a polymorphic type. > > */ > > - numArguments = get_aggregate_argtypes(aggref, inputTypes); > > + numTransFnArgs = get_aggregate_argtypes(aggref, transFnInputTypes); > > Not sure I understand the distinction you're trying to make with the > variable renaming. The combine function is also a transition function, > no? I was trying to make it more clear what each variable is for. It's true that the combine function is used as a transition function in this case, but I'd hoped it would be more easy to understand that the input arguments listed in a variable named transFnInputTypes would be for the function mentioned in pg_aggregate.aggtransfn rather than the transfn we're using. If that's not any more clear then maybe another fix is better, or we can leave it... I had to make sense of all this code last night and I was just having a go at making it easier to follow for the next person who has to. > > /* Count the "direct" arguments, if any */ > > numDirectArgs = list_length(aggref->aggdirectargs); > > > > + /* > > + * Combine functions always have a 2 trans state type input params, so > > + * this is always set to 1 (we don't count the first trans state). > > + */ > > Perhaps the parenthetical should instead be something like "to 1 (the > trans type is not counted as an arg, just like with non-combine trans > function)" or similar? Yeah, that's better. > > > @@ -2781,7 +2791,7 @@ ExecInitAgg(Agg *node, EState *estate, int eflags) > > aggref, transfn_oid, aggtranstype, > > serialfn_oid, deserialfn_oid, > > initValue, initValueIsNull, > > - inputTypes, numArguments); > > + transFnInputTypes, numArguments); > > That means we pass in the wrong input types? Seems like it'd be better > to either pass an empty list, or just create the argument list here. What do you mean "here"? Did you mean to quote this fragment? @@ -2880,7 +2895,7 @@ build_pertrans_for_aggref(AggStatePerTrans pertrans, Oid aggtransfn, Oid aggtranstype, Oid aggserialfn, Oid aggdeserialfn, Datum initValue, bool initValueIsNull, - Oid *inputTypes, int numArguments) + Oid *transFnInputTypes, int numArguments) I had hoped the rename would make it more clear that these are the args for the function in pg_aggregate.aggtransfn. We could pass NULL instead when it's the combine func, but I didn't really see the advantage of it. > I'm inclined to push a minimal fix now, and then a slightly more evolved > version fo this after beta1. Ok > > diff --git a/src/test/regress/sql/aggregates.sql b/src/test/regress/sql/aggregates.sql > > index d4fd657188..bd8b9e8b4f 100644 > > --- a/src/test/regress/sql/aggregates.sql > > +++ b/src/test/regress/sql/aggregates.sql > > @@ -963,10 +963,11 @@ SET enable_indexonlyscan = off; > > > > -- variance(int4) covers numeric_poly_combine > > -- sum(int8) covers int8_avg_combine > > +-- regr_cocunt(float8, float8) covers int8inc_float8_float8 and aggregates with > 1 arg > > typo... oops. I spelt coconut wrong. :) > > > EXPLAIN (COSTS OFF) > > - SELECT variance(unique1::int4), sum(unique1::int8) FROM tenk1; > > + SELECT variance(unique1::int4), sum(unique1::int8),regr_count(unique1::float8, unique1::float8) FROM tenk1; > > > > -SELECT variance(unique1::int4), sum(unique1::int8) FROM tenk1; > > +SELECT variance(unique1::int4), sum(unique1::int8),regr_count(unique1::float8, unique1::float8) FROM tenk1; > > > > ROLLBACK; > > Does this actually cover the bug at issue here? The non-combine case > wasn't broken? The EXPLAIN shows the plan is: QUERY PLAN ---------------------------------------------- Finalize Aggregate -> Gather Workers Planned: 4 -> Partial Aggregate -> Parallel Seq Scan on tenk1 (5 rows) so it is exercising the combine functions. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
pgsql-hackers by date: