Thread: array_accum aggregate
Greetings, The array_accum example aggregate in the user documentation works reasonably on small data sets but doesn't work too hot on large ones. http://www.postgresql.org/docs/8.1/static/xaggr.html Normally I wouldn't care particularly much but it turns out that PL/R uses arrays for quite a bit (eg: histograms and other statistics functions). I've also heard other complaints about the performance of arrays, though I'm not sure if those were due to array_accum or something else. Long story short, I set out to build a faster array_accum. Much to my suprise and delight, we already *had* one. accumArrayResult() and makeArrayResult()/construct_md_array() appear to do a fantastic job. I've created a couple of 'glue' functions to expose these functions so they can be used in an aggregate. I'm sure they could be improved upon and possibly made even smaller than they already are (90 lines total for both) but I'd like to throw out the idea of including them in core. The aggregate created with them could also be considered for inclusion though I'm less concerned with that. I don't expect general PostgreSQL users would have trouble creating the aggregate- I don't know that the average user would be able or willing to write the C functions. For comparison, the new functions run with: time psql -c "select aaccum(generate_series) from generate_series(1,1000000);" > /dev/null 4.24s real 0.34s user 0.06s system Compared to: time psql -c "select array_accum(generate_series) from generate_series(1,1000000);" > /dev/null ... Well, it's still running and it's been over an hour. The main differences, as I see it, are: accumArrayResult() works in chunks of 64 elements, and uses repalloc(). array_accum uses array_set() which works on individual elements and uses palloc()/memcpy(). I appriciate that this is done because for most cases of array_set() it's not acceptable to modify the input and am not suggesting that be changed. An alternative might be to modify array_set() to check if it is in an aggregate and change its behavior but adding the seperate functions seemed cleaner and much less intrusive to me. Please find the functions attached. Thanks, Stephen
Attachment
Stephen Frost <sfrost@snowman.net> writes: > Long story short, I set out to build a faster array_accum. Much to my > suprise and delight, we already *had* one. accumArrayResult() and > makeArrayResult()/construct_md_array() appear to do a fantastic job. > I've created a couple of 'glue' functions to expose these functions so > they can be used in an aggregate. I'm sure they could be improved > upon and possibly made even smaller than they already are (90 lines > total for both) but I'd like to throw out the idea of including them > in core. The aggregate created with them could also be considered for > inclusion though I'm less concerned with that. Since you've set up the functions to only be usable inside an aggregate, I don't see much of a reason why we wouldn't provide the aggregate too. It looks like it should work to have just one polymorphic aggregate definition, eg, array_accum(anyelement) returns anyarray. As far as coding style goes, you're supposed to use makeArrayResult() with accumArrayResult(), not call construct_md_array() directly. And copying the ArrayBuildState around like that is just plain bizarre... Does the thing work without memory leakage (for a pass-by-ref datatype) in a GROUP BY situation? regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > Long story short, I set out to build a faster array_accum. Much to my > > suprise and delight, we already *had* one. accumArrayResult() and > > makeArrayResult()/construct_md_array() appear to do a fantastic job. > > I've created a couple of 'glue' functions to expose these functions so > > they can be used in an aggregate. I'm sure they could be improved > > upon and possibly made even smaller than they already are (90 lines > > total for both) but I'd like to throw out the idea of including them > > in core. The aggregate created with them could also be considered for > > inclusion though I'm less concerned with that. > > Since you've set up the functions to only be usable inside an aggregate, > I don't see much of a reason why we wouldn't provide the aggregate too. Sure, I don't see a reason these functions would be useful outside of an aggregate in the form they need to be in for the aggregate, either.. > It looks like it should work to have just one polymorphic aggregate > definition, eg, array_accum(anyelement) returns anyarray. I was hoping to do that, but since it's an aggregate the ffunc format is pre-defined to require accepting the 'internal state' and nothing else, and to return 'anyelement' or 'anyarray' one of the inputs must be an 'anyelement' or 'anyarray', aiui. That also implied to me that the type passed in was expected to be the type passed out, which wouldn't necessairly be the case here as the internal state variable is a bytea. It might be possible to do away with the internal state variable entirely though and make it an anyelement instead, if we can find somewhere else to put the pointer to the ArrayBuildState. > As far as coding style goes, you're supposed to use makeArrayResult() > with accumArrayResult(), not call construct_md_array() directly. And > copying the ArrayBuildState around like that is just plain bizarre... I had attempted to use makeArrayResult() originally and ran into some trouble with the MemoryContextDelete() which is done during it. The context used was the AggState context and therefore was deleted elsewhere. That might have been a misunderstanding on my part though since I recall fixing at least one or two bugs afterwards, so it may be possible to go back and change to using makeArrayResult(). That'd certainly make the ffunc smaller. As for ArrayBuildState, I'm not actually copying the structure around, just the pointer is being copied around inside of the state transistion variable (which is a bytea). I'm not sure where else to store the pointer to the ArrayBuildState though, since multiple could be in play at a given time during an aggregation, aiui. > Does the thing work without memory leakage (for a pass-by-ref datatype) > in a GROUP BY situation? I expected that using the AggState context would handle free'ing the memory at the end of the aggregation as I believe that's the context used for the state variable itself as well. Honestly, I wasn't entirely sure how to create my own context or if that was the correct thing to do in this case. I'll see about changing it around to do that if it's acceptable to have a context created for each instance of a group-by aggregate, and I can figure out how. :) I'm not sure about memory leakage during a Sort+GroupAgg.. I don't know how that's done currently either, honestly. It seems like memory could be free'd during that once the ffunc is called, and an extra memory context could do that, is that what you're referring to? Thanks! Stephen
* Stephen Frost (sfrost@snowman.net) wrote: > For comparison, the new functions run with: > time psql -c "select aaccum(generate_series) from generate_series(1,1000000);" > /dev/null > 4.24s real 0.34s user 0.06s system > > Compared to: > time psql -c "select array_accum(generate_series) from generate_series(1,1000000);" > /dev/null > ... > > Well, it's still running and it's been over an hour. Just to follow-up on this, the result was: 7601.36s real 0.36s user 0.02s system Or about 2 hours. Enjoy, Stephen
Stephen Frost <sfrost@snowman.net> writes: > * Tom Lane (tgl@sss.pgh.pa.us) wrote: >> It looks like it should work to have just one polymorphic aggregate >> definition, eg, array_accum(anyelement) returns anyarray. > I was hoping to do that, but since it's an aggregate the ffunc format is > pre-defined to require accepting the 'internal state' and nothing else, > and to return 'anyelement' or 'anyarray' one of the inputs must be an > 'anyelement' or 'anyarray', aiui. Hmm ... I hadn't been thinking about what the state type would need to be, but certainly "bytea" is a lie given what you're really doing. We've run into this same problem in contrib/intagg: sometimes you'd like to use a state data structure that isn't any regular SQL datatype, and in particular isn't just a single blob of memory. That's a problem from nodeAgg's point of view because it expects to be responsible for copying the state value from one context to another. Don't have an immediate idea for a solution ... regards, tom lane
Tom Lane wrote: > Stephen Frost <sfrost@snowman.net> writes: >> * Tom Lane (tgl@sss.pgh.pa.us) wrote: >>> It looks like it should work to have just one polymorphic aggregate >>> definition, eg, array_accum(anyelement) returns anyarray. > >> I was hoping to do that, but since it's an aggregate the ffunc format is >> pre-defined to require accepting the 'internal state' and nothing else, >> and to return 'anyelement' or 'anyarray' one of the inputs must be an >> 'anyelement' or 'anyarray', aiui. > > Hmm ... I hadn't been thinking about what the state type would need to > be, but certainly "bytea" is a lie given what you're really doing. > We've run into this same problem in contrib/intagg: sometimes you'd like > to use a state data structure that isn't any regular SQL datatype, and > in particular isn't just a single blob of memory. That's a problem from > nodeAgg's point of view because it expects to be responsible for copying > the state value from one context to another. Don't have an immediate > idea for a solution ... I used an int8 as state type for an implementation of a kind of array_accum_unique aggregate. I know that it's a ugly hack, but it has been running on a production machine for months now, without a problem... The memory is alloc'd from the aggcontext, btw. Note that I only use this aggregate in one particular query - so there might be problems with my approach that just don't manifest in my particular situation. For example, the aggregate is used only on a table that is never updated, and it is only used in select queries. So there might be problems if the executor decides that it has to restart a query... greetings, Florian Pflug
* Tom Lane (tgl@sss.pgh.pa.us) wrote: > Stephen Frost <sfrost@snowman.net> writes: > > I was hoping to do that, but since it's an aggregate the ffunc format is > > pre-defined to require accepting the 'internal state' and nothing else, > > and to return 'anyelement' or 'anyarray' one of the inputs must be an > > 'anyelement' or 'anyarray', aiui. > > Hmm ... I hadn't been thinking about what the state type would need to > be, but certainly "bytea" is a lie given what you're really doing. Indeed. I've updated the functions quite a bit to clean things up, including: Added many more comments, removed the unnecessary 'storage*' pointer being used, created my own structure for tracking state information, created a seperate memory context (tied to the AggContext), correctly handle NULL values, and changed the ffunc to use makeArrayResult. I also tried just tried using polymorphic types for the functions and for the aggregate and it appeared to just work: create function aaccum_sfunc (anyarray, anyelement) returns anyarray language 'C' AS 'aaccum.so', 'aaccum_sfunc' ; create function aaccum_ffunc (anyarray) returns anyarray language 'C' AS '/data/sfrost/postgres/arrays/aaccum.so', 'aaccum_ffunc' ; create aggregate aaccum ( sfunc = aaccum_sfunc, basetype = anyelement, stype = anyarray, finalfunc = aaccum_ffunc ); select aaccum(generate_series) from generate_series(1,5); aaccum ------------- {1,2,3,4,5} (1 row) (test is a table with one varchar column, abc) select aaccum(abc) from test; aaccum --------- {a,b,c} (1 row) (Added a column called 'hi', set to 'a', added b,b and c,b) select hi,aaccum(abc) from test group by hi; hi | aaccum ----+--------- b | {b,c} a | {a,b,c} (2 rows) It makes some sense that it would work as an 'anyarray' is just a variable-length type internally and so long as nothing else attempts to make sense out of our 'fake array' everything should be fine. The latest version also appears to be a bit faster than the prior version. I'm going to be running a very large query shortly using this aaccum and will report back how it goes. Please let me know if there are any other improvments or changes I should make. I'd like to submit this to -patches w/ the appropriate entries to have it be included in the core distribution. Is it acceptable to reuse the 'array_accum' name even though it was used in the documentation as an example? I'm thinking yes, but wanted to check. Thanks! Stephen
Attachment
* Stephen Frost (sfrost@snowman.net) wrote: > I'm going to be running a very large query shortly using > this aaccum and will report back how it goes. It went *very* well, actually much better than I had originally expected. This query used to take over 12 hours to complete (about 11 of which was after the main table involved was sorted). With this new aaccum in place the whole query only took about an hour, most of which was the sort and join required by the query. The aggregation (aaccum) and r_hist() (R histogram function generating PNGs) took only a few minutes. Thanks! Stephen
On 10/10/06, Stephen Frost <sfrost@snowman.net> wrote: > * Stephen Frost (sfrost@snowman.net) wrote: > > I'm going to be running a very large query shortly using > > this aaccum and will report back how it goes. > > It went *very* well, actually much better than I had originally > expected. This query used to take over 12 hours to complete (about 11 > of which was after the main table involved was sorted). With this new > aaccum in place the whole query only took about an hour, most of which > was the sort and join required by the query. The aggregation (aaccum) > and r_hist() (R histogram function generating PNGs) took only a few > minutes. very cool, and definately useful imo. i use array_accum all the time, and have not always been happy with its performance. if your stuff passes muster, would be nice to see it move into core :-). merlin