Re: multi-column aggregates - Mailing list pgsql-general

From Chris Kratz
Subject Re: multi-column aggregates
Date
Msg-id 200603131421.39910.chris.kratz@vistashare.com
Whole thread Raw
In response to Re: multi-column aggregates  (Berend Tober <btober@seaworthysys.com>)
Responses Re: multi-column aggregates
List pgsql-general
Hello Berend,

Thanks for the reply.  Yes, subselects would work very well and in some ways
are more elegant then the hand waving we had to do to get the multi-column
aggregates to work.  The reason we moved away from the subselects is that the
queries tend to be quite complex and all of the joins and where clause and
other odds and ends which are used to limit the parent select would need to
be in each subselect for it to accurately reflect the first or last within
the record set shown.  My example didn't really reflect the fact that the
parent query is often quite complex with multiple joins, subselects of it's
own, etc.

Probably the best thing would be to use intermediate temp tables to get the
rowset that needs grouping and then use subselects to get the appropriate
first or last item in each case.  I was just hoping we could come up with a
aggregate so we could easily slip it into place beside the normal aggregates
provided (sum, avg, count, etc) by the application.

-Chris

On Thursday 09 March 2006 03:20 pm, Berend Tober wrote:
> I may not fully understand your situation but
>
> SELECT distinct grouping,
> (SELECT integer_column FROM test.test_agg_last WHERE
> grouping=t1.grouping ORDER BY cur_date DESC LIMIT 1) AS last_int,
> (SELECT integer_column FROM test.test_agg_last WHERE
> grouping=t1.grouping ORDER BY cur_date ASC LIMIT 1) AS first_int,
> (SELECT integer_column FROM test.test_agg_last WHERE
> grouping=t1.grouping ORDER BY cur_date2 DESC LIMIT 1) AS last_int2,
> (SELECT integer_column FROM test.test_agg_last WHERE
> grouping=t1.grouping ORDER BY cur_date2 ASC LIMIT 1) AS first_int2
> FROM test.test_agg_last t1
>
> produced the same result as at least your first example:
>
> 1,8,78,78,8
> 2,32,32,32,32
>
> I think it does what you are trying to do and can be expanded to your
> other examples. Maybe not as cool as using composite aggregates, though.
>
> -- BMT
>
> Chris Kratz wrote:
> >Thanks Tom,
> >
> >Well for anyone else who may be interested in doing something similar,
> > here is what we did.  It does require typecasting going into the
> > functions, composite types and using the dot notation to get the value
> > back out of the composite object returned.  But it works.
> >
> >This is what we wanted...
> >
> >select last(cur_date, some_column) from some_table....
> >
> >We got this close...
> >
> >select (last((cur_date, some_column)::last_int_agg)).value as last_int
> > from...
> >
> >which I think will be useable for what we need.  If anyone has ideas to
> >simplify this, I would appreciate it.  Example and generation script
> > attached at end.
> >
> >I do have to say that the flexibility in postgres for creating our own
> > data types and aggregate functions is wonderfull.  Kudos again to
> > everyone who has but so much time and energy into postgres.
> >---
> >
> >-Chris
> >
> >On Thursday 09 March 2006 01:08 pm, Tom Lane wrote:
> >>Chris Kratz <chris.kratz@vistashare.com> writes:
> >>>Is there any way in postgres to have an aggregate that uses input from
> >>>two columns without using composite types?
> >>
> >>No.
> >>
> >>            regards, tom lane
> >
> >--------------------------------------------------------------------------
> >----------- simple example test data::
> >--------------------------------------------------------------------------
> >----------- test=# select id, grouping, cur_date::date, cur_date2::date,
> > integer_column from test_agg_last;
> > id | grouping |  cur_date  | cur_date2  | integer_column
> >----+----------+------------+------------+----------------
> >  1 |        1 | 2006-01-05 | 2006-01-03 |              8
> >  2 |        1 | 2006-01-01 | 2006-01-05 |             78
> >  3 |        2 | 2006-01-03 | 2006-01-01 |             32
> >(3 rows)
> >
> >test=# select
> >test-#    grouping,
> >test-#    (last((cur_date, integer_column)::last_int_agg)).value as
> > last_int, test-#    (first((cur_date,
> > integer_column)::last_int_agg)).value as first_int,
> >test-#    (last((cur_date2, integer_column)::last_int_agg)).value as
> >last_int2,
> >test-#    (first((cur_date2, integer_column)::last_int_agg)).value as
> >first_int2
> >test-# from test_agg_last
> >test-# group by grouping
> >test-# order by grouping
> >test-# ;
> > grouping | last_int | first_int | last_int2 | first_int2
> >----------+----------+-----------+-----------+------------
> >        1 |        8 |        78 |        78 |          8
> >        2 |       32 |        32 |        32 |         32
> >(2 rows)
> >
> >--------------------------------------------------------------------------
> >----------- First and Last aggregates using an arbitrary date column
> >--------------------------------------------------------------------------
> >----------- -- aggregate types
> >create type last_int_agg as (cur_date timestamp, value int);
> >create type last_txt_agg as (cur_date timestamp, value text);
> >create type last_rel_agg as (cur_date timestamp, value double precision);
> >create type last_num_agg as (cur_date timestamp, value numeric(12,2));
> >create type last_dte_agg as (cur_date timestamp, value date);
> >create type last_tme_agg as (cur_date timestamp, value time);
> >create type last_bln_agg as (cur_date timestamp, value boolean);
> >create type last_ntv_agg as (cur_date timestamp, value interval);
> >
> >-- generic last accumulator function
> >CREATE OR REPLACE function last_accum(anyelement, anyelement) returns
> >anyelement
> >   AS $$
> >   BEGIN
> >      IF $1.cur_date IS NULL and $2.cur_date IS NOT NULL
> >         THEN RETURN $2;
> >      ELSEIF $1.cur_date IS NOT NULL and $2.cur_date IS NULL
> >         THEN RETURN $1;
> >      ELSEIF $2.cur_date>$1.cur_date
> >         THEN RETURN $2;
> >      ELSE RETURN $1;
> >      END IF;
> >   END;
> >$$ LANGUAGE plpgsql;
> >
> >-- generic first accumulator function
> >CREATE OR REPLACE function first_accum(anyelement, anyelement) returns
> >anyelement
> >   AS $$
> >   BEGIN
> >      IF $1.cur_date IS NULL and $2.cur_date IS NOT NULL
> >         THEN RETURN $2;
> >      ELSEIF $1.cur_date IS NOT NULL and $2.cur_date IS NULL
> >         THEN RETURN $1;
> >      ELSEIF $2.cur_date<$1.cur_date
> >         THEN RETURN $2;
> >      ELSE RETURN $1;
> >      END IF;
> >   END;
> >$$ LANGUAGE plpgsql;
> >
> >-- last aggregate
> >CREATE AGGREGATE last (
> >    sfunc = last_accum,
> >    basetype = anyelement,
> >    stype = anyelement
> >);
> >
> >-- first aggregate
> >CREATE AGGREGATE first (
> >    sfunc = first_accum,
> >    basetype = anyelement,
> >    stype = anyelement
> >);
> >
> >-- test data
> >
> >create table test_agg_last(
> >   id serial primary key,
> >   grouping integer,
> >   cur_date timestamp,
> >   cur_date2 timestamp,
> >   integer_column integer,
> >   real_column double precision,
> >   currency_column numeric(12,2),
> >   text_column text,
> >   date_column date,
> >   time_column time without time zone,
> >   interval_column interval,
> >   boolean_column boolean);
> >
> >insert into test_agg_last(grouping, cur_date, cur_date2, integer_column,
> >real_column, currency_column, text_column, date_column, time_column,
> >interval_column, boolean_column)
> >values(1, '1/5/06', '1/3/06', 8, 38.7, '12.00', 'Four score', '12/3/78',
> >'3:32pm', '1 day', true);
> >
> >insert into test_agg_last(grouping, cur_date, cur_date2, integer_column,
> >real_column, currency_column, text_column, date_column, time_column,
> >interval_column, boolean_column)
> >values(1, '1/1/06', '1/5/06', 78, 1.998, '35000.00', 'and seven',
> > '6/18/05', '12:00am', '4 hours', false);
> >
> >innsert into test_agg_last(grouping, cur_date, cur_date2, integer_column,
> >real_column, currency_column, text_column, date_column, time_column,
> >interval_column, boolean_column)
> >values(2, '1/3/06', '1/1/06', 32, 0.000001, '100000.00', 'years ago',
> >'1/25/2010', '11:37am', '23 minutes', true);
> >
> >-- test using multiple date columns with first and last
> >select
> >   grouping,
> >   (last((cur_date, integer_column)::last_int_agg)).value as last_int,
> >   (first((cur_date, integer_column)::last_int_agg)).value as first_int,
> >   (last((cur_date2, integer_column)::last_int_agg)).value as last_int2,
> >   (first((cur_date2, integer_column)::last_int_agg)).value as first_int2
> >from test_agg_last
> >group by grouping
> >order by grouping
> >;
> >
> >-- test several different common types
> >select
> >   grouping,
> >   (last((cur_date, integer_column)::last_int_agg)).value as last_int,
> >   (first((cur_date, integer_column)::last_int_agg)).value as first_int,
> >   (last((cur_date, real_column)::last_rel_agg)).value as last_real,
> >   (first((cur_date, real_column)::last_rel_agg)).value as first_real,
> >   (last((cur_date, currency_column)::last_num_agg)).value as
> > last_currency, (first((cur_date, currency_column)::last_num_agg)).value
> > as first_currency, (last((cur_date, text_column)::last_txt_agg)).value as
> > last_text, (first((cur_date, text_column)::last_txt_agg)).value as
> > first_text, (last((cur_date, date_column)::last_dte_agg)).value as
> > last_date, (first((cur_date, date_column)::last_dte_agg)).value as
> > first_date, (last((cur_date, time_column)::last_tme_agg)).value as
> > last_time, (first((cur_date, time_column)::last_tme_agg)).value as
> > first_time, (last((cur_date, interval_column)::last_ntv_agg)).value as
> > last_interval, (first((cur_date, interval_column)::last_ntv_agg)).value
> > as first_interval, (last((cur_date, boolean_column)::last_bln_agg)).value
> > as last_boolean, (first((cur_date, boolean_column)::last_bln_agg)).value
> > as first_boolean
> >
> >from test_agg_last
> >group by grouping
> >order by grouping
> >;
> >
> >-- cleanup test data
> >
> >drop table test_agg_last;
> >
> >---------------------------(end of broadcast)---------------------------
> >TIP 5: don't forget to increase your free space map settings

pgsql-general by date:

Previous
From: Teodor Sigaev
Date:
Subject: Re: Tsearch2 ranking
Next
From: "Kevin Grittner"
Date:
Subject: Wisconsin Circuit Court Access (WCCA) on PostgreSQL