Thread: (P)SQL for a sum with constraints
Hello all, I have a table with 2 bigint columns, let's call them A and B. I need a query that will allow me to return A alongside the sum of Bs from rows where A is less than or equal to this row's A. It is best described with some example data, consider the following: A B 1 0 2 1 3 0 4 2 5 1 I want to be able to make a query that returns the following rows: A funkySumB 1 0 2 1 3 1 4 3 5 4 Anyone have any ideas how to do this? I'm able to do it programmatically, but it's slow. Optimally I'd like to be able to do this in the DB. As you can see, it's a little trickier that the usual aggregate function with a GROUP BY and HAVING. -- Shug
On Wed, Apr 15, 2009 at 11:09:49AM +0100, Shug Boabby wrote: > Anyone have any ideas how to do this? I'm able to do it > programmatically, but it's slow. Optimally I'd like to be able to do > this in the DB. As you can see, it's a little trickier that the usual > aggregate function with a GROUP BY and HAVING. this is called cumulative sum, and can be done quite easily. check this blogpost (but also its comments! - the c function is not necessary!) http://www.depesz.com/index.php/2007/08/17/rownum-anyone-cumulative-sum-in-one-query/ depesz -- Linkedin: http://www.linkedin.com/in/depesz / blog: http://www.depesz.com/ jid/gtalk: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007
In response to Shug Boabby : > Hello all, > > I have a table with 2 bigint columns, let's call them A and B. I need > a query that will allow me to return A alongside the sum of Bs from > rows where A is less than or equal to this row's A. It is best > described with some example data, consider the following: > > A B > 1 0 > 2 1 > 3 0 > 4 2 > 5 1 > > I want to be able to make a query that returns the following rows: > > A funkySumB > 1 0 > 2 1 > 3 1 > 4 3 > 5 4 > wait for 8.4: test=# select * from shug ; a | b ---+--- 1 | 0 2 | 1 3 | 0 4 | 2 5 | 1 (5 rows) test=# select a, sum(b) over (order by a) from shug ; a | sum ---+----- 1 | 0 2 | 1 3 | 1 4 | 3 5 | 4 (5 rows) Andreas -- Andreas Kretschmer Kontakt: Heynitz: 035242/47150, D1: 0160/7141639 (mehr: -> Header) GnuPG-ID: 0x3FFF606C, privat 0x7F4584DA http://wwwkeys.de.pgp.net
I simplified my problem a little too much and now I'm stuck trying to use cumulative_sum(). My schema is not only A, B but also has a C A B C 1 0 1 2 1 1 3 0 1 4 2 1 5 1 1 1 0 2 2 1 2 3 0 2 4 2 2 5 1 2 and I want to be able to do the cumulative sum only when C is the same. E.g. A funkySumB C 1 0 1 2 1 1 3 1 1 4 3 1 5 4 1 1 0 2 2 1 2 3 1 2 4 3 2 5 4 2 Also... could you please explain why the syntax requires the "user_id::text" to be passed to cumulative sum? I'm confused why the "::text" part is there. 2009/4/15 hubert depesz lubaczewski <depesz@depesz.com>: > On Wed, Apr 15, 2009 at 01:37:45PM +0100, Shug Boabby wrote: >> Thanks, I'll look at this... I still have to define cumulative_sum >> though, right? > > yes. it's just the c function that is obsolete. > > Best regards, > > depesz
On Wed, 15 Apr 2009 15:32:42 +0100, Shug Boabby wrote: > I simplified my problem a little too much and now I'm stuck trying to > use cumulative_sum(). My schema is not only A, B but also has a C > > A B C > 1 0 1 > 2 1 1 > 3 0 1 > 4 2 1 > 5 1 1 > 1 0 2 > 2 1 2 > 3 0 2 > 4 2 2 > 5 1 2 > > and I want to be able to do the cumulative sum only when C is the same. E.g. > > A funkySumB C > 1 0 1 > 2 1 1 > 3 1 1 > 4 3 1 > 5 4 1 > 1 0 2 > 2 1 2 > 3 1 2 > 4 3 2 > 5 4 2 If I understand the problem correctly, why not just something like this?: SELECT t1.a, sum(t2.b), t1.c FROM abc t1 JOIN abc t2 ON t2.a <= t1.a AND t2.c=t1.c GROUP BY t1.a, t1.c; -- Michał Politowski Talking has been known to lead to communication if practiced carelessly.
Life sure would be easier if that were the case Michal, but no... that is not the case here. The sum is not a simple sum, it is a sum of all elements having a lower or equal A and the same C. This is a "cumulative sum" as pointed out by others. 2009/4/15 Michal Politowski <mpol+pg@meep.pl>: > On Wed, 15 Apr 2009 15:32:42 +0100, Shug Boabby wrote: >> I simplified my problem a little too much and now I'm stuck trying to >> use cumulative_sum(). My schema is not only A, B but also has a C >> >> A B C >> 1 0 1 >> 2 1 1 >> 3 0 1 >> 4 2 1 >> 5 1 1 >> 1 0 2 >> 2 1 2 >> 3 0 2 >> 4 2 2 >> 5 1 2 >> >> and I want to be able to do the cumulative sum only when C is the same. E.g. >> >> A funkySumB C >> 1 0 1 >> 2 1 1 >> 3 1 1 >> 4 3 1 >> 5 4 1 >> 1 0 2 >> 2 1 2 >> 3 1 2 >> 4 3 2 >> 5 4 2 > > If I understand the problem correctly, why not just something like this?: > > SELECT t1.a, sum(t2.b), t1.c FROM abc t1 JOIN abc t2 ON t2.a <= t1.a AND > t2.c=t1.c GROUP BY t1.a, t1.c;
Oh... and also, A, B, C are in the same table. 2009/4/17 Shug Boabby <shug.boabby@gmail.com>: > Life sure would be easier if that were the case Michal, but no... that > is not the case here. The sum is not a simple sum, it is a sum of all > elements having a lower or equal A and the same C. This is a > "cumulative sum" as pointed out by others. > > 2009/4/15 Michal Politowski <mpol+pg@meep.pl>: >> On Wed, 15 Apr 2009 15:32:42 +0100, Shug Boabby wrote: >>> I simplified my problem a little too much and now I'm stuck trying to >>> use cumulative_sum(). My schema is not only A, B but also has a C >>> >>> A B C >>> 1 0 1 >>> 2 1 1 >>> 3 0 1 >>> 4 2 1 >>> 5 1 1 >>> 1 0 2 >>> 2 1 2 >>> 3 0 2 >>> 4 2 2 >>> 5 1 2 >>> >>> and I want to be able to do the cumulative sum only when C is the same. E.g. >>> >>> A funkySumB C >>> 1 0 1 >>> 2 1 1 >>> 3 1 1 >>> 4 3 1 >>> 5 4 1 >>> 1 0 2 >>> 2 1 2 >>> 3 1 2 >>> 4 3 2 >>> 5 4 2 >> >> If I understand the problem correctly, why not just something like this?: >> >> SELECT t1.a, sum(t2.b), t1.c FROM abc t1 JOIN abc t2 ON t2.a <= t1.a AND >> t2.c=t1.c GROUP BY t1.a, t1.c; >
Michal... I must apologise, your suggestion worked a treat!!! I never realised it was possible to do a join on a table to itself before! 2009/4/15 Michal Politowski <mpol+pg@meep.pl>: > On Wed, 15 Apr 2009 15:32:42 +0100, Shug Boabby wrote: >> I simplified my problem a little too much and now I'm stuck trying to >> use cumulative_sum(). My schema is not only A, B but also has a C >> >> A B C >> 1 0 1 >> 2 1 1 >> 3 0 1 >> 4 2 1 >> 5 1 1 >> 1 0 2 >> 2 1 2 >> 3 0 2 >> 4 2 2 >> 5 1 2 >> >> and I want to be able to do the cumulative sum only when C is the same. E.g. >> >> A funkySumB C >> 1 0 1 >> 2 1 1 >> 3 1 1 >> 4 3 1 >> 5 4 1 >> 1 0 2 >> 2 1 2 >> 3 1 2 >> 4 3 2 >> 5 4 2 > > If I understand the problem correctly, why not just something like this?: > > SELECT t1.a, sum(t2.b), t1.c FROM abc t1 JOIN abc t2 ON t2.a <= t1.a AND > t2.c=t1.c GROUP BY t1.a, t1.c; > > -- > Michał Politowski > Talking has been known to lead to communication if practiced carelessly. > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >