Thread: Parameters in user-defined aggregate final functions

Parameters in user-defined aggregate final functions

From
Esteban Zimanyi
Date:
I am creating a user-defined aggregate function that needs an additional parameter. More precisely it is a cumulative (aka window) minimum that takes as second parameter a time interval defining the window. Since the aggregate function operates on my user-defined data types I have conveyed a dummy example that concatenates the n last values of a text column. I am aware that I can solve this dummy problem in PostgreSQL but the purpose of the example is only to highlight my problem.

CREATE FUNCTION lastNconcat_transfn(state text[], next text, n integer)
    RETURNS text[] AS $$
    BEGIN
        RETURN array_append(state, next);
    END;
    $$ LANGUAGE plpgsql IMMUTABLE STRICT PARALLEL SAFE;

CREATE FUNCTION lastNconcat_combinefn(state1 text[], state2 text[], n integer)
    RETURNS text[] AS $$
    BEGIN
        RETURN array_concat(state1, state2);
    END;
    $$ LANGUAGE plpgsql IMMUTABLE STRICT PARALLEL SAFE;

CREATE FUNCTION lastNconcat_finalfn(state text[], n integer)
    RETURNS text AS $$
    DECLARE
        card integer;
        result text;
    BEGIN
        result := '';
        card := array_length(state, 1);
        FOR i IN greatest(1,card-n+1)..card
        LOOP 
            result := result || state[i];
        END LOOP;
        RETURN result;
    END;
    $$ LANGUAGE plpgsql IMMUTABLE STRICT PARALLEL SAFE;

CREATE AGGREGATE lastNconcat(text, integer) (
    SFUNC = lastNconcat_transfn,
    STYPE = text[],
    INITCOND = '{}',
    COMBINEFUNC = lastNconcat_combinefn,
    FINALFUNC = lastNconcat_finalfn,
    PARALLEL = SAFE
);

I receive the following error

ERROR: function lastnconcat_finalfn(text[]) does not exist
SQL state: 42883

How to tell PostgreSQL that my final function also needs a parameter? I am working on PostgreSQL 10.1. I know that according to the documentation direct parameters are only allowed for ordered-set aggregates, but I would also need a direct parameter for "normal" aggregates.

Notice that the solution proposed here 
is neither ideal nor efficient.

IMHO since combine functions accept parameters I don't see why final functions should not also accept parameters.

-- 
------------------------------------------------------------
Prof. Esteban Zimanyi
Department of Computer & Decision Engineering  (CoDE) CP 165/15    
Universite Libre de Bruxelles            
Avenue F. D. Roosevelt 50                
B-1050 Brussels, Belgium                 
fax: + 32.2.650.47.13
tel: + 32.2.650.31.85
e-mail: ezimanyi@ulb.ac.be
Internet: http://code.ulb.ac.be/
------------------------------------------------------------

Re: Parameters in user-defined aggregate final functions

From
David Fetter
Date:
On Thu, Jan 11, 2018 at 08:51:27PM +0100, Esteban Zimanyi wrote:
> I am creating a user-defined aggregate function that needs an additional
> parameter. More precisely it is a cumulative (aka window) minimum that
> takes as second parameter a time interval defining the window. Since the
> aggregate function operates on my user-defined data types I have conveyed a
> dummy example that concatenates the n last values of a text column. I am
> aware that I can solve this dummy problem in PostgreSQL but the purpose of
> the example is only to highlight my problem.
> 
> CREATE FUNCTION lastNconcat_transfn(state text[], next text, n integer)
>     RETURNS text[] AS $$
>     BEGIN
>         RETURN array_append(state, next);
>     END;
>     $$ LANGUAGE plpgsql IMMUTABLE STRICT PARALLEL SAFE;
> 
> CREATE FUNCTION lastNconcat_combinefn(state1 text[], state2 text[], n
> integer)
>     RETURNS text[] AS $$
>     BEGIN
>         RETURN array_concat(state1, state2);
>     END;
>     $$ LANGUAGE plpgsql IMMUTABLE STRICT PARALLEL SAFE;
> 
> CREATE FUNCTION lastNconcat_finalfn(state text[], n integer)
>     RETURNS text AS $$
>     DECLARE
>         card integer;
>         result text;
>     BEGIN
>         result := '';
>         card := array_length(state, 1);
>         FOR i IN greatest(1,card-n+1)..card
>         LOOP
>             result := result || state[i];
>         END LOOP;
>         RETURN result;
>     END;
>     $$ LANGUAGE plpgsql IMMUTABLE STRICT PARALLEL SAFE;
> 
> CREATE AGGREGATE lastNconcat(text, integer) (
>     SFUNC = lastNconcat_transfn,
>     STYPE = text[],
>     INITCOND = '{}',
>     COMBINEFUNC = lastNconcat_combinefn,
>     FINALFUNC = lastNconcat_finalfn,
>     PARALLEL = SAFE
> );
> 
> I receive the following error
> 
> ERROR: function lastnconcat_finalfn(text[]) does not exist
> SQL state: 42883
> 
> How to tell PostgreSQL that my final function also needs a parameter? I am
> working on PostgreSQL 10.1. I know that according to the documentation
> direct parameters are only allowed for ordered-set aggregates, but I would
> also need a direct parameter for "normal" aggregates.
> 
> Notice that the solution proposed here
>
https://stackoverflow.com/questions/48189751/direct-arguments-in-postgresql-user-defined-aggregate-functions/48190288?noredirect=1#comment83364017_48190288
> is neither ideal nor efficient.
> 
> IMHO since combine functions accept parameters I don't see why final
> functions should not also accept parameters.

This is an interesting problem.  In CREATE AGGREGATE, I count 10
parameters that could easily have a function attached. One could
imagine an aggregate which took different parameters at each stage,
but is there really any sane way to do this other than making a call
to the aggregate with those parameters all included, passing each
along as one goes?

SELECT my_custom_agg(expression) WITH ([finalfunc_args = ...][, finalfunc_extra_args = ...]...)

is what I could come up with.  It seems ugly as grammar and
ill-advised in that it makes promises about the implementation details
of aggregates into a distant future.

What am I missing?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: Parameters in user-defined aggregate final functions

From
Tom Lane
Date:
Esteban Zimanyi <ezimanyi@ulb.ac.be> writes:
> How to tell PostgreSQL that my final function also needs a parameter? I am
> working on PostgreSQL 10.1. I know that according to the documentation
> direct parameters are only allowed for ordered-set aggregates, but I would
> also need a direct parameter for "normal" aggregates.

So define it as an ordered-set aggregate, and just ignore the question
of whether you need to sort the input (which is something that we leave
to the aggregate function to do anyway).  The syntax would be a little
weird/non-orthogonal, but you can blame the SQL committee for that.

regression=# create function trans(int, int) returns int language sql
regression-# as 'select $1+$2' strict;
CREATE FUNCTION
regression=# create function final(int, float8) returns float8 language sql
regression-# as 'select $1*$2' strict;
CREATE FUNCTION
regression=# create aggregate myosa(float8 order by int) (
regression(# sfunc = trans, stype = int, finalfunc = final);
CREATE AGGREGATE
regression=# select sum(ten), myosa(0.5) within group (order by ten) from tenk1;
  sum  | myosa
-------+-------
 45000 | 22500
(1 row)


            regards, tom lane


Re: Parameters in user-defined aggregate final functions

From
Robert Haas
Date:
On Thu, Jan 11, 2018 at 6:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Esteban Zimanyi <ezimanyi@ulb.ac.be> writes:
>> How to tell PostgreSQL that my final function also needs a parameter? I am
>> working on PostgreSQL 10.1. I know that according to the documentation
>> direct parameters are only allowed for ordered-set aggregates, but I would
>> also need a direct parameter for "normal" aggregates.
>
> So define it as an ordered-set aggregate, and just ignore the question
> of whether you need to sort the input (which is something that we leave
> to the aggregate function to do anyway).  The syntax would be a little
> weird/non-orthogonal, but you can blame the SQL committee for that.

Or alternatively, don't define a final function at all, or define one
that just serializes the transition state to JSON or whatever.  Then
define some completely separate function that takes the transition
state (or the serialized representation thereof) and the additional
parameters and write something like:

SELECT completely_separate_nonaggregate_function(not_quite_the_aggregate_i_really_want(stuff),
42, 'omaha') FROM my_table;

Like Tom's proposal that's syntactically different but it should be
close enough.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Parameters in user-defined aggregate final functions

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Jan 11, 2018 at 6:11 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> So define it as an ordered-set aggregate, and just ignore the question
>> of whether you need to sort the input (which is something that we leave
>> to the aggregate function to do anyway).  The syntax would be a little
>> weird/non-orthogonal, but you can blame the SQL committee for that.

> Or alternatively, don't define a final function at all, or define one
> that just serializes the transition state to JSON or whatever.

A third possibility, which preserves notational simplicity at some
cost, is just to define the aggregate as taking the additional
parameter(s) as regular aggregate inputs.  The transition function
would simply remember the last (or first) values of those parameters
as part of the transition state, and the final function would use
them from there.

The costs of this are:

1. you'd evaluate the additional params again at each row.  I think
this is probably not a big deal if they're just constants, but YMMV.

2. if the aggregate executes over zero input rows, you don't get an
opportunity to collect the extra params at all.  This might be fatal,
but it could also be a nonissue, either because you know your
application never aggregates over no rows, or because the correct
answer would be NULL or some such regardless of the extra params.

So there's more than one way to do it ...

            regards, tom lane