Re: array_accum aggregate - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: array_accum aggregate
Date
Msg-id 20061006220520.GK24675@kenobi.snowman.net
Whole thread Raw
In response to Re: array_accum aggregate  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: array_accum aggregate  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> >   Long story short, I set out to build a faster array_accum.  Much to my
> >   suprise and delight, we already *had* one.  accumArrayResult() and
> >   makeArrayResult()/construct_md_array() appear to do a fantastic job.
> >   I've created a couple of 'glue' functions to expose these functions so
> >   they can be used in an aggregate.  I'm sure they could be improved
> >   upon and possibly made even smaller than they already are (90 lines
> >   total for both) but I'd like to throw out the idea of including them
> >   in core.  The aggregate created with them could also be considered for
> >   inclusion though I'm less concerned with that.
>
> Since you've set up the functions to only be usable inside an aggregate,
> I don't see much of a reason why we wouldn't provide the aggregate too.

Sure, I don't see a reason these functions would be useful outside of an
aggregate in the form they need to be in for the aggregate, either..

> It looks like it should work to have just one polymorphic aggregate
> definition, eg, array_accum(anyelement) returns anyarray.

I was hoping to do that, but since it's an aggregate the ffunc format is
pre-defined to require accepting the 'internal state' and nothing else,
and to return 'anyelement' or 'anyarray' one of the inputs must be an
'anyelement' or 'anyarray', aiui.  That also implied to me that the type
passed in was expected to be the type passed out, which wouldn't
necessairly be the case here as the internal state variable is a bytea.
It might be possible to do away with the internal state variable
entirely though and make it an anyelement instead, if we can find
somewhere else to put the pointer to the ArrayBuildState.

> As far as coding style goes, you're supposed to use makeArrayResult()
> with accumArrayResult(), not call construct_md_array() directly.  And
> copying the ArrayBuildState around like that is just plain bizarre...

I had attempted to use makeArrayResult() originally and ran into some
trouble with the MemoryContextDelete() which is done during it.  The
context used was the AggState context and therefore was deleted
elsewhere.  That might have been a misunderstanding on my part though
since I recall fixing at least one or two bugs afterwards, so it may be
possible to go back and change to using makeArrayResult().  That'd
certainly make the ffunc smaller.

As for ArrayBuildState, I'm not actually copying the structure around,
just the pointer is being copied around inside of the state transistion
variable (which is a bytea).  I'm not sure where else to store the
pointer to the ArrayBuildState though, since multiple could be in play
at a given time during an aggregation, aiui.

> Does the thing work without memory leakage (for a pass-by-ref datatype)
> in a GROUP BY situation?

I expected that using the AggState context would handle free'ing the
memory at the end of the aggregation as I believe that's the context
used for the state variable itself as well.  Honestly, I wasn't entirely
sure how to create my own context or if that was the correct thing to do
in this case.  I'll see about changing it around to do that if it's
acceptable to have a context created for each instance of a group-by
aggregate, and I can figure out how. :)

I'm not sure about memory leakage during a Sort+GroupAgg..  I don't know
how that's done currently either, honestly.  It seems like memory could
be free'd during that once the ffunc is called, and an extra memory
context could do that, is that what you're referring to?
Thanks!
    Stephen

pgsql-hackers by date:

Previous
From: Josh Berkus
Date:
Subject: Re: 8.2 translation status?
Next
From: Stephen Frost
Date:
Subject: Re: array_accum aggregate