Thread: Creating an aggregate function

Creating an aggregate function

From
Roberto Mello
Date:
Hi all, 

This e-mail is long because I am providing lots of details on the
problem. It's an OpenACS module that we are porting so it'll have lots
of users, you can be sure.

I have this view that I am porting that is a UNION of two self OUTER
JOINed views. PG does not support UNIONs in views, so we thought of
creating our own aggregate functions to do the job. 

There are no examples for CREATE AGGREGATE in the docs, but we made it
work, at least partially, but it explodes a little after the first row
column. 

Here's the original Oracle query: 
  create or replace view cs_n_sessions_day_user as  select b.date_id, b.n_sessions_day, nvl(a.n_users, 0) as members,
                b.n_users as non_members  from cs_historical_visits_grouped a,cs_historical_visits_grouped b  where
b.n_sessions_day= a.n_sessions_day(+)                    and b.date_id = a.date_id(+)                    and 1 =
a.member_p(+)                   and b.member_p = 0  UNION  select a.date_id, a.n_sessions_day, a.n_users as members,
                nvl(b.n_users, 0) as non_members  from cs_historical_visits_grouped a, cs_historical_visits_grouped b
where a.n_sessions_day = b.n_sessions_day(+)                    and a.date_id = b.date_id(+)                    and
a.member_p= 1                    and 0 = b.member_p(+);
 
Here is our port of the cs_historical_visits_grouped view: 
  create view cs_historical_visits_grouped as   select date_id, n_sessions_day, not_null_integer_p(user_id) as
member_p,count(browser_id) as n_users   from cs_historical_visits   group by date_id, n_sessions_day;
 

not_null_integer_p is a function we created that basically does case
when user_id is null then 1 else 0 end, but for some reason PG wouldn't
take it in a view. 

Here's a helper view for our aggregate function: 
  create view cs_hist_visits_grouped_txt as   select date_id, n_sessions_day, member_p, n_users,
member_p::varchar|| ' ' || n_users::varchar as  member_p_n_users  from cs_historical_visits_grouped;
 

And here are the aggregate functions: 
  create function agg_if_member (integer, varchar)  returns integer as '  declare          a_int alias for $1;
a_varcharalias for $2;          member_p integer;          num_members integer;  begin          member_p :=
substr(a_varchar,1, 1)::integer;          num_members := substr(a_varchar, 3)::integer;          if member_p = 1 then
              return num_members;          else                  return a_int;          end if;  end;  ' language
'plpgsql';
  create function agg_if_not_member (integer, varchar)  returns integer as '  declare          a_int alias for $1;
   a_varchar alias for $2;          return_val integer;          member_p integer;          num_members integer;  begin
        member_p := substr(a_varchar, 1, 1)::integer;          num_members := substr(a_varchar, 3)::integer;
return_val:= 0;          if a_int is not null then             return_val := a_int;          end if;          if
member_p= 0 and num_members > a_int then          return_val := num_members;          end if;          return
return_val; end;  ' language 'plpgsql';
 
  create aggregate num_members_or_zero (        basetype = varchar,        stype1 = integer,        sfunc1 =
agg_if_member,       initcond1 = 0);
 
  create aggregate num_non_members_or_zero (         basetype = varchar,         stype1 = integer,         sfunc1 =
agg_if_not_member,        initcond1 = 0);
 

and our query was: 
    select date_id, n_sessions_day,    num_members_or_zero(member_p_n_users) as members,
num_non_members_or_zero(member_p_n_users)as non_members     from foobar     group by date_id, n_sessions_day;
 

Our fake table foobar had the same structure as our cs_hist_grouped_txt
view has, and this was the dummy data we had inserted to
play with: 

    date_id | n_sessions_day | member_p | n_users | member_p_n_users
---------+----------------+----------+---------+------------------          1 |            500 |        0 |      50 | 0
50          1 |            500 |        1 |      30 | 1 30           2 |           2000 |        1 |     210 | 1 210
      2 |           2000 |        0 |     999 | 0 999   
 

and our output was: 

     date_id | n_sessions_day |  members  | non_members    ---------+----------------+-----------+-------------
 1 |            500 |        30 |   136394044           2 |           2000 | 136394612 |   136394612 
 

the values for members is what we expected for date_id == 1, but the
other ones are all goofy. Anybody has any ideas or worked with CREATE
AGGREGATE before? 

Sorry about the long email and TIA. 
-Roberto Mello
-- 
Roberto Mello, rmello@cc.usu.edu - GNU/Linux Reg.User #96240 Computer Science - Utah State UniversityUSU Free Software
andGNU/Linux Club, Presidenthttp://fslc.usu.edu - http://www.brasileiro.net/roberto
 


Re: Creating an aggregate function

From
Tom Lane
Date:
Roberto Mello <rmello@cc.usu.edu> writes:
> and our output was: 
>       date_id | n_sessions_day |  members  | non_members
>      ---------+----------------+-----------+-------------
>             1 |            500 |        30 |   136394044
>             2 |           2000 | 136394612 |   136394612 

In current sources I get the expected results if I write the
CREATE AGGREGATE commands withinitcond1 = '0');
Although the parser will accept unquoted integers as initcond
inputs, it looks like something downstream is mishandling them,
resulting in an invalid initial value entered into the pg_aggregate
entry for the aggregate.

I'll look into fixing that for 7.1, but in the meantime try putting
quotes around the initial values.

BTW, your agg_if_member function is not very good, because if it gets
more than one input row with member_p = 1 then the result is order-
dependent.  You don't want that.
        regards, tom lane