Re: complex custom aggregate function - Mailing list pgsql-general

From Scara Maccai
Subject Re: complex custom aggregate function
Date
Msg-id 481041.38433.qm@web24607.mail.ird.yahoo.com
Whole thread Raw
In response to complex custom aggregate function  (Scara Maccai <m_lists@yahoo.it>)
List pgsql-general
I think I've found a solution myself to the moving average problem, so I'm posting it here in case it works for some
others(and so that everybody can check that I'm doing it right and in the "best" way...) 
Basically I'm
1) saving all the couples (timestamp, double) of an aggregation into an array using array_accum
2) unnesting the same array to get a table of (timestamp, double) tuples
3) calculating the moving average using the "self join" trick (found in "Transact SQL") of the table got from 2)
4) getting the max out of the average values.
Before the code, some questions:
1) I tried doing it using python, but the array was passed as a string (I didn't find it in the docs though...)
2) I didn't understand why the "array_enum" function is present only for int arrays in the contrib "intagg" module;
couldn'tit be a  function with "anyarray"  as argument? Such as: 
CREATE OR REPLACE FUNCTION array_enum(anyarray)
RETURNS setof anyelement
AS 'array_unnest'
LANGUAGE INTERNAL IMMUTABLE STRICT;


3) Am I doing something wrong in the code below? Could I do it better?
CREATE AGGREGATE array_accum (anyelement)
(
    sfunc = array_append,
    stype = anyarray,
    initcond = '{}'
);
CREATE TYPE timemax_t AS (
    t       timestamp,
    v       double precision
);
CREATE OR REPLACE FUNCTION timemax_array_enum(timemax_t[])
RETURNS setof timemax_t
AS 'array_unnest'
LANGUAGE INTERNAL IMMUTABLE STRICT;


CREATE or replace FUNCTION movingavgmax(a timemax_t[], nquarters int4)
  RETURNS double precision
AS $$
DECLARE
    timemax_t_row timemax_t;
    retval    double precision;
BEGIN
select max(mx) into retval from
(
 select x..t, AVG(y.v) as mx
 from timemax_array_enum(a) as x, timemax_array_enum(a) as y
 where
 x.t between y.t and y.t+((nquarters-1)*15||' minutes')::interval
 group by x.t
 having count(y.v)=nquarters
 ) as subs;
return retval;
END;
$$ LANGUAGE plpgsql;


-- example usage:
select movingavgmax(array_accum((quarter,value)::timemax_t), 3) from test where id = 10 AND quarter between '2008-12-01
00:00:00'and '2008-12-01 10:00:00' ; 

----- Messaggio originale -----
> Da: Scara Maccai <m_lists@yahoo.it>
> A: pgsql-general@postgresql.org
> Inviato: Venerdì 30 gennaio 2009, 9:35:53
> Oggetto: complex custom aggregate function
>
> Hi all,
>
> I have a table like:
>
> value int,
> quarter timestamp
>
> I need an aggregate function that gives back the maximum "value" using
> this algorithm:
>
> AVG of the first hour (first 4 quarters) (AVG0)
> same as above, but 1 quarter later (AVG1)
> ...
> same as above, but n quarters later (AVGn)
>
> result: the quarter where AVGn was MAX.
>
> Example:
>
> quarter        value        AVGn
>
> 2008-01-01 00:00     10
> 2008-01-01 00:15     15
> 2008-01-01 00:30     5
> 2008-01-01 00:45     20    -> 12.5 ((10+15+5+20)/4)
> 2008-01-01 01:15     2    -> 21    ((15+5+20+2)/4)
> 2008-01-01 01:30     30    -> 14.25 ((5+20+2+30)/4))
>
> the result should be ('2008-01-01 00:15', 21)
>
>
>
> It would be very easy if the input to the custom aggregate function was
> ordered (because I would keep 4 internal counters), but I guess there's
> no way of "forcing" the ordering of the input to the function, right?
>
> So I have to cache all the (quarter,value) couples and give back a
> result at the end, right?



      Passa a Yahoo! Mail.

La webmail che ti offre GRATIS spazio illimitato,
antispam e messenger integrato.
http://it.mail.yahoo.com/              


pgsql-general by date:

Previous
From: Grzegorz Jaśkiewicz
Date:
Subject: Re: Out of memory on SELECT in 8.3.5
Next
From: John R Pierce
Date:
Subject: Re: Out of memory on SELECT in 8.3.5