Re: reduce many loosely related rows down to one - Mailing list pgsql-sql
From | Marc Mamin |
---|---|
Subject | Re: reduce many loosely related rows down to one |
Date | |
Msg-id | B6F6FD62F2624C4C9916AC0175D56D880CDF7FCE@jenmbs01.ad.intershop.net Whole thread Raw |
In response to | Re: reduce many loosely related rows down to one ("Torsten Grust" <torsten.grust@gmail.com>) |
List | pgsql-sql |
> SELECT id, > (array_agg(rspid))[1] AS rspid, -- (1) for such cases, I have created an new aggregate function: SELECT firstnotnull(rspid) AS rspid, this avoid to collect first all rspid values to then keep only the first one... CREATE OR REPLACE FUNCTION public.first_agg_nn ( anyelement, anyelement ) RETURNS anyelement AS $$ SELECT $1; $$ LANGUAGESQL IMMUTABLE STRICT COST 1; CREATE AGGREGATE public.firstnotnull ( sfunc = public.first_agg_nn, basetype = anyelement, stype = anyelement ); regards, Marc Mamin > -----Original Message----- > From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql- > owner@postgresql.org] On Behalf Of Torsten Grust > Sent: Dienstag, 28. Mai 2013 17:08 > To: pgsql-sql@postgresql.org > Subject: Re: [SQL] reduce many loosely related rows down to one > > On 25 May 2013, at 9:19, Bill MacArthur wrote (with possible > deletions): > > [...] > > select * from test; > > > > id | rspid | nspid | cid | iac | newp | oldp | ppv | tppv > > ----+-------+-------+-----+-----+------+------+---------+--------- > > 1 | 2 | 3 | 4 | t | | | | > > 1 | 2 | 3 | | | 100 | | | > > 1 | 2 | 3 | | | | 200 | | > > 1 | 2 | 3 | | | | | | 4100.00 > > 1 | 2 | 3 | | | | | | 3100.00 > > 1 | 2 | 3 | | | | | -100.00 | > > 1 | 2 | 3 | | | | | 250.00 | > > 2 | 7 | 8 | 4 | | | | | > > (8 rows) > > > > -- I want this result (where ppv and tppv are summed and the other > > distinct values are boiled down into one row) > > -- I want to avoid writing explicit UNIONs that will break if, say > the > > "cid" was entered as a discreet row from the row containing "iac" > > -- in this example "rspid" and "nspid" are always the same for a > given > > ID, however they could possibly be absent for a given row as well > > > > id | rspid | nspid | cid | iac | newp | oldp | ppv | tppv > > ----+-------+-------+-----+-----+------+------+---------+--------- > > 1 | 2 | 3 | 4 | t | 100 | 200 | 150.00 | 7200.00 > > 2 | 7 | 8 | 4 | | | | 0.00 | 0.00 > > One possible option could be > > SELECT id, > (array_agg(rspid))[1] AS rspid, -- (1) > (array_agg(nspid))[1] AS nspid, > (array_agg(cid))[1] AS cid, > bool_or(iac) AS iac, -- (2) > max(newp) AS newp, -- (3) > min(oldp) AS oldp, -- (4) > coalesce(sum(ppv), 0) AS ppv, > coalesce(sum(tppv),0) AS tppv > FROM test > GROUP BY id; > > > This query computes the desired output for your example input. > > There's a caveat here: your description of the problem has been > somewhat vague and it remains unclear how the query should respond if > the functional dependency id -> rspid does not hold. In this case, the > array_agg(rspid)[1] in the line marked (1) will pick one among many > different(!) rspid values. > I don't know your scenario well enough to judge whether this would be > an acceptable behavior. Other possible behaviors have been implemented > in the lines (2), (3), (4) where different aggregation functions are > used to reduce sets to a single value (e.g., pick the largest/smallest > of many values ...). > > Cheers, > --Torsten > > > -- > | Torsten "Teggy" Grust > | Torsten.Grust@gmail.com > > > -- > Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) To make > changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-sql