Re: Aggregate function API versus grouping sets - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Aggregate function API versus grouping sets
Date
Msg-id 2240.1404312004@sss.pgh.pa.us
Whole thread Raw
In response to Aggregate function API versus grouping sets  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
Responses Re: Aggregate function API versus grouping sets  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
List pgsql-hackers
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> 1. Assumptions about the relationship between aggcontext and fn_extra

> Tom's version of the ordered-set aggregate code makes the assumption
> that it is safe to store the aggcontext returned by AggCheckCallContext
> in a structure hung off flinfo->fn_extra. This implicitly assumes that
> the function will be called in only one aggcontext, or at least only one
> per flinfo.

> Doing rollup via GroupAggregate by maintaining multiple transition
> values at a time (one per grouping set) means that the transfn is being
> called interleaved for transition values in different contexts. So the
> question becomes: is it wrong for the transition function to assume that
> aggcontext can be cached this way, or is it necessary for the executor
> to use a separate flinfo for each concurrent grouping set?

Hm.  We could probably move gcontext into the per-group data.  I'm not
sure though if there are any other dependencies there on the groups being
evaluated serially.  More generally, I wonder whether user-defined
aggregates are likely to be making assumptions that will be broken by
this.

> 2. AggGetPerAggEContext

> The comment documents this as returning the per-output-tuple econtext,
> and the ordered-set code assumes that the rescan of this context implies
> that the aggcontext is also about to be reset (so cleanup functions can
> be called).

> Since it seems that the cleanup callback is the sole reason for this
> function to exist, is it acceptable to remove any implication that the
> context returned is the overall per-output-tuple one, and simply state
> that it is a context whose cleanup functions are called at the
> appropriate time before aggcontext is reset?

Well, I think the implication is that it's the econtext in which the
result of the aggregation will be used.

Another approach would be to remove AggGetPerAggEContext as such from the
API altogether, and instead offer an interface that says "register an
aggregate cleanup callback", leaving it to the agg/window core code to
figure out which context to hang that on.  I had thought that exposing
this econtext and the per-input-tuple one would provide useful generality,
but maybe we should rethink that.

Obviously, the time grows short to reconsider this API, but it's not
quite too late.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Joe Conway
Date:
Subject: Re: Audit of logout
Next
From: Robert Haas
Date:
Subject: Re: RLS Design