Re: array_accum aggregate - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: array_accum aggregate
Date
Msg-id 20061009165547.GS24675@kenobi.snowman.net
Whole thread Raw
In response to Re: array_accum aggregate  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: array_accum aggregate  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > I was hoping to do that, but since it's an aggregate the ffunc format is
> > pre-defined to require accepting the 'internal state' and nothing else,
> > and to return 'anyelement' or 'anyarray' one of the inputs must be an
> > 'anyelement' or 'anyarray', aiui.
>
> Hmm ... I hadn't been thinking about what the state type would need to
> be, but certainly "bytea" is a lie given what you're really doing.

Indeed.  I've updated the functions quite a bit to clean things up,
including: Added many more comments, removed the unnecessary 'storage*'
pointer being used, created my own structure for tracking state
information, created a seperate memory context (tied to the AggContext),
correctly handle NULL values, and changed the ffunc to use
makeArrayResult.

I also tried just tried using polymorphic types for the functions and
for the aggregate and it appeared to just work:

    create function aaccum_sfunc (anyarray, anyelement) returns anyarray
        language 'C' AS 'aaccum.so', 'aaccum_sfunc'
    ;
    create function aaccum_ffunc (anyarray) returns anyarray language
        'C' AS '/data/sfrost/postgres/arrays/aaccum.so', 'aaccum_ffunc'
    ;
    create aggregate aaccum (
        sfunc = aaccum_sfunc,
        basetype = anyelement,
        stype = anyarray,
        finalfunc = aaccum_ffunc
    );

    select aaccum(generate_series) from generate_series(1,5);
       aaccum
    -------------
     {1,2,3,4,5}
    (1 row)

    (test is a table with one varchar column, abc)
    select aaccum(abc) from test;
     aaccum
    ---------
     {a,b,c}
    (1 row)

    (Added a column called 'hi', set to 'a', added b,b and c,b)
    select hi,aaccum(abc) from test group by hi;
     hi | aaccum
    ----+---------
     b  | {b,c}
     a  | {a,b,c}
    (2 rows)

It makes some sense that it would work as an 'anyarray' is just a
variable-length type internally and so long as nothing else attempts to
make sense out of our 'fake array' everything should be fine.

The latest version also appears to be a bit faster than the prior
version.  I'm going to be running a very large query shortly using
this aaccum and will report back how it goes.  Please let me know if
there are any other improvments or changes I should make.  I'd like to
submit this to -patches w/ the appropriate entries to have it be
included in the core distribution.  Is it acceptable to reuse the
'array_accum' name even though it was used in the documentation as an
example?  I'm thinking yes, but wanted to check.

    Thanks!

        Stephen

Attachment

pgsql-hackers by date:

Previous
From: "Jim C. Nasby"
Date:
Subject: Re: pg_dump exclusion switches and functions/types
Next
From: "Jim C. Nasby"
Date:
Subject: Re: pg_dump exclusion switches and functions/types