Home > mailing lists

Re: Sharing aggregate states between different aggregate functions - Mailing list pgsql-hackers

From	David Rowley
Subject	Re: Sharing aggregate states between different aggregate functions
Date	July 27, 2015 05:34:18
Msg-id	CAKJS1f-36W-3ODF-0pa2g_hmQVJH5Or-M-o9WPA0DqFp0+UBSw@mail.gmail.com Whole thread
In response to	Re: Sharing aggregate states between different aggregate functions (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses	Re: Sharing aggregate states between different aggregate functions
List	pgsql-hackers

Tree view

On 27 July 2015 at 03:24, Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 07/09/2015 12:44 PM, David Rowley wrote:
On 15 June 2015 at 12:05, David Rowley <david.rowley@2ndquadrant.com> wrote:

This basically allows an aggregate's state to be shared between other
aggregate functions when both aggregate's transition functions (and a few
other things) match
There's quite a number of aggregates in our standard set which will
benefit from this optimisation.

After compiling the original patch with another compiler, I noticed a
couple of warnings.

The attached fixes these.

I spent some time reviewing this. I refactored the ExecInitAgg code rather heavily to make it more readable (IMHO); see attached. What do you think? Did I break anything?

Thanks for taking the time to look at this and makes these fixes.

I'm just looking over your changes:

- ereport(ERROR,

- (errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),

- errmsg("aggregate %u needs to have compatible input type and transition type",

- aggref->aggfnoid)));

+ ereport(ERROR,

+ (errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),

+ errmsg("aggregate needs to have compatible input type and transition type")));

I can't quite see the reason to remove the agg OID from the error message here. It seems to be still valid to use as build_peraggstate_for_aggref() only is called when nothing is shared.

- * agg_input_types, agg_state_type, agg_result_type identify the input,

- * transition, and result types of the aggregate. These should all be

- * resolved to actual types (ie, none should ever be ANYELEMENT etc).

+ * agg_input_types identifies the input types of the aggregate. These should

+ * be resolved to actual types (ie, none should ever be ANYELEMENT etc).

I'm not sure I understand why agg_state_type and agg_result_type have changed here.

+ peraggstate->sortstates = (Tuplesortstate **)

+ palloc0(sizeof(Tuplesortstate *) * numGroupingSets);

+ for (currentsortno = 0; currentsortno < numGroupingSets; currentsortno++)

+ peraggstate->sortstates[currentsortno] = NULL;

This was not you, but this NULL setting looks unneeded due to the palloc0().

Some comments:

* In aggref_has_compatible_states(), you give up if aggtype or aggcollid differ. But those properties apply to the final function, so you were leaving some money on the table by disallowing state-sharing if they differ.

Good catch, and accurate analogy. Thanks for fixing.

* The filter and input expressions are initialized for every AggRef, before the deduplication logic kicks in. The AggrefExprState.aggfilter, aggdirectargs and args fields really belong to the AggStatePerAggState struct instead. This is not a new issue, but now that we have a convenient per-aggstate struct to put them in, let's do so.

Good idea. I failed to notice that code over there in execQual.c so I agree that where you've moved it to is much better.

* There was a reference-after free bug in aggref_has_compatible_states; you cannot ReleaseSysCache and then continue pointing to the struct.

Thanks for fixing.

In this function I also wasn't quite sure if it was with comparing both non-NULL INITCOND's here. I believe my code comments may slightly contradict what the code actually does, as the comments talk about them having to match, but the code just bails if any are non-NULL. The reason I didn't check them was because it seems inevitable that some duplicate work needs to be done when setting up the INITCOND. Perhaps it's worth it?

select aggfnoid || '(' || typname || ')',aggtransfn,agginitval

from pg_aggregate

inner join pg_type on aggtranstype = oid

where aggtransfn in (select aggtransfn

from pg_aggregate

group by aggtransfn

having count(*)>1)

order by aggtransfn;

This indicates that everything using float4_accum as a transfn could benefit from that. I just wasn't sure how far to go.

* The code was a bit fuzzy on which parts of the per-aggstate are filled in at what time. Some of the fields were overwritten every time a match was found. With the same values, so no harm done, but I found it confusing. I refactored ExecInitAgg in the attached patch to clear that up.

* There API of build_aggregate_fnexprs() was a bit strange now that some callers use it to only fill in the final function, some call it to fill both the transition functions and the final function. I split it to two separate functions.

That's much better.

* I wonder if we should do this duplicate elimination at plan time. It's very fast, so I'm not worried about that right now, but you had grand plans to expand this so that an aggregate could optionally use one of many different kinds of state values. At that point, it certainly seems like a planning decision to decide which aggregates share state. I think we can leave it as it is for now, but it's something to perhaps consider later.

I don't think I'm going to get the time to work on the "supporting aggregate" stuff you're talking about, but I think it's a good enough idea to keep around for the future, so I think this shared aggregate states stuff probably should go into nodeAgg.c for now. I have to say though, I was a little surprised to find this code in the executor rather than the planner when I first started on this.

BTW, the name of the AggStatePerAggStateData struct is pretty horrible. The repeated "AggState" feels awkward. Now that I've stared at the patch for a some time, it doesn't bother me anymore, but it took me quite a while to over that. I'm sure it will for others too. And it's not just that struct, the comments talk about "aggregate state", which could be confused to mean "AggState", but it actually means AggStatePerAggStateData. I don't have any great suggestions, but can you come up a better naming scheme?

I agree, they're horrible. The thing that's causing the biggest problem is the struct named AggState, which carries state for *all* aggregates, and has nothing to do with "transition state", so it seems there's two different meanings if the word "state" and I've used both meanings in the name for AggStatePerAggStateData.

Perhaps just renaming AggStatePerAggStateData to AggStateTransStateData would be good enough?

I've attached a delta patch based on your patch, in this I've:

1. Renamed AggStatePerAggStateData to AggStateTransStateData and all variables using that are renamed to suit better.

2. Removed surplus peraggstate->sortstates[currentsortno] = NULL; (not related to this patch, but since we're moving that part of the code, we'd better fix)

3. Put back the missing aggfnoid from the error message.

4. Changed initialize_aggregates() to not pass the states. They're already in AggState and we're using aggstate->numstates to get the count of the items in that array, so it seems wrong to allow a different array to ever be passed in.

5. Changed wording of a few comments to try and reduce confusing of 'state' and 'transition state'.

6. Renamed AggState.peraggstate to transstates. I pluralised this to try to reduce confusion of the single state pointers named 'transstate' in the functions in nodeAgg.c. I did think that peragg should also become peraggs and pergroup should become pergroups, but didn't change those.

Anything else I changed is self explanatory.

What do you think?

Regards

David Rowley

--
David Rowley http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Training & Services

Attachment

sharing_aggstate-heikki-1_delta1.patch

pgsql-hackers by date:

From: Piotr Stefaniak
Date: 27 July 2015, 05:34:07
Subject: Re: spgist recovery assertion failure

From: Andreas Seltenreich
Date: 27 July 2015, 06:04:13
Subject: Re: Failing assertions in indxpath.c, placeholder.c and brin_minmax.c

Re: Sharing aggregate states between different aggregate functions - Mailing list pgsql-hackers

Attachment

Previous

Next