Thread: Screwy behavior with SUM and multiple joins to same table
Screwy behavior with SUM and multiple joins to same table: __START SCRIPT__ drop table product; create table product (prod varchar, name varchar); insert into product (prod, name) values ('A', 'Cat Food'); insert into product (prod, name) values ('B', 'Dog Food'); drop table sales; create table sales (prod varchar, store integer, units integer); insert into sales (prod, store, units) values ('A', 1, 50); insert into sales (prod, store, units) values ('A', 2, 100); \echo \echo cat food in store 1: select p.prod , sum(s.units) as store_1 from product p , sales s where p.prod = s.prod and store=1 and p.prod='A' group by p.prod; \echo \echo cat food in store 2: select p.prod , sum(s.units) as store_2 from product p , sales s where p.prod = s.prod and store=2 and p.prod='A' group by p.prod; \echo \echo cat food in stores 1 & 2: select p.prod , sum(s1.units) as store_1 , sum(s2.units) as store_2 , sum(sAll.units) as store_All from product p , sales s1 , sales s2 , sales sAll where p.prod = s1.prod and s1.store=1 and p.prod = s2.prod and s2.store=2 and p.prod = sAll.prod and sAll.store in (1,2) and p.prod='A' group by p.prod; \echo \echo cat food in stores 1 & 2 (sans products): select sum(s1.units) as store_1 , sum(s2.units) as store_2 , sum(sAll.units) as store_All from sales s1 , sales s2 , sales sAll where s1.store=1 and s1.prod = 'A' and s2.store=2 and s2.prod = 'A' and s2.store in (1,2) and sAll.prod = 'A' ; __END SCRIPT__ When run produces: animals=# \i animal_food.sql DROP CREATE INSERT 672972905 1 INSERT 672972906 1 DROP CREATE INSERT 672972934 1 INSERT 672972935 1 cat food in store 1: prod | store_1 ------+--------- A | 50 (1 row) cat food in store 2: prod | store_2 ------+--------- A | 100 (1 row) cat food in stores 1 & 2: prod | store_1 | store_2 | store_all ------+---------+---------+----------- A | 100 | 200 | 150 (1 row) cat food in stores 1 & 2 (sans products): store_1 | store_2 | store_all ---------+---------+----------- 100 | 200 | 150 (1 row) You can see that store totals for 1 and 2 are not reported correctly in the last two cases. I can see how this happens, however one would expect the engine to keep s1, and s2 figures separate from sAll figures. How do I get around this? Thanks -dl __________________________________________________ Do You Yahoo!? Yahoo! Finance - Get real-time stock quotes http://finance.yahoo.com
Stephan Szabo sszabo@bigpanda.com On Tue, 27 Aug 2002, David Link wrote: > Screwy behavior with SUM and multiple joins to same table: > > __START SCRIPT__ > > drop table product; > create table product (prod varchar, name varchar); > insert into product (prod, name) values ('A', 'Cat Food'); > insert into product (prod, name) values ('B', 'Dog Food'); > > drop table sales; > create table sales (prod varchar, store integer, units integer); > insert into sales (prod, store, units) values ('A', 1, 50); > insert into sales (prod, store, units) values ('A', 2, 100); > > > \echo > \echo cat food in store 1: > select p.prod > , sum(s.units) as store_1 > from product p > , sales s > where p.prod = s.prod and store=1 and p.prod='A' > group by p.prod; > > \echo > \echo cat food in store 2: > select p.prod > , sum(s.units) as store_2 > from product p > , sales s > where p.prod = s.prod and store=2 and p.prod='A' > group by p.prod; > > \echo > \echo cat food in stores 1 & 2: > select p.prod > , sum(s1.units) as store_1 > , sum(s2.units) as store_2 > , sum(sAll.units) as store_All > from product p > , sales s1 > , sales s2 > , sales sAll > where p.prod = s1.prod and s1.store=1 > and p.prod = s2.prod and s2.store=2 > and p.prod = sAll.prod and sAll.store in (1,2) > and p.prod='A' > group by p.prod; > > \echo > \echo cat food in stores 1 & 2 (sans products): > select sum(s1.units) as store_1 > , sum(s2.units) as store_2 > , sum(sAll.units) as store_All > from sales s1 > , sales s2 > , sales sAll > where s1.store=1 and s1.prod = 'A' > and s2.store=2 and s2.prod = 'A' > and s2.store in (1,2) and sAll.prod = 'A' > ; > In these last two the joins result in two rows. s1.units is 50 in each row, s2.units is 100 in each row. When you sum them you get 100 and 200. If you want the queries to be separate, you probably want subqueries in the general form select p.prod, (select sum(s1.units) from store_1 where s1.store=1 and s1.prod=p.prod), ... from product p where p.prod='A';
I think you have a conceptual problem here. Was happens is that the database prforms your joins first to produce a virtual table of 8 rows, it then applies your where lauses to chop that down and finally does the sum(). Of course, that is only the conceptual model. It's actually more intelligent than that but it gives you an idea of why you're are getting the results you are. Subqueries may be what you're looking for. This is how SQL works and any database will do the same. Hope this helps, On Tue, Aug 27, 2002 at 12:36:42PM -0700, David Link wrote: > Screwy behavior with SUM and multiple joins to same table: > > __START SCRIPT__ > > drop table product; > create table product (prod varchar, name varchar); > insert into product (prod, name) values ('A', 'Cat Food'); > insert into product (prod, name) values ('B', 'Dog Food'); > > drop table sales; > create table sales (prod varchar, store integer, units integer); > insert into sales (prod, store, units) values ('A', 1, 50); > insert into sales (prod, store, units) values ('A', 2, 100); > > > \echo > \echo cat food in store 1: > select p.prod > , sum(s.units) as store_1 > from product p > , sales s > where p.prod = s.prod and store=1 and p.prod='A' > group by p.prod; > > \echo > \echo cat food in store 2: > select p.prod > , sum(s.units) as store_2 > from product p > , sales s > where p.prod = s.prod and store=2 and p.prod='A' > group by p.prod; > > \echo > \echo cat food in stores 1 & 2: > select p.prod > , sum(s1.units) as store_1 > , sum(s2.units) as store_2 > , sum(sAll.units) as store_All > from product p > , sales s1 > , sales s2 > , sales sAll > where p.prod = s1.prod and s1.store=1 > and p.prod = s2.prod and s2.store=2 > and p.prod = sAll.prod and sAll.store in (1,2) > and p.prod='A' > group by p.prod; > > \echo > \echo cat food in stores 1 & 2 (sans products): > select sum(s1.units) as store_1 > , sum(s2.units) as store_2 > , sum(sAll.units) as store_All > from sales s1 > , sales s2 > , sales sAll > where s1.store=1 and s1.prod = 'A' > and s2.store=2 and s2.prod = 'A' > and s2.store in (1,2) and sAll.prod = 'A' > ; > > __END SCRIPT__ > > When run produces: > > animals=# \i animal_food.sql > DROP > CREATE > INSERT 672972905 1 > INSERT 672972906 1 > DROP > CREATE > INSERT 672972934 1 > INSERT 672972935 1 > > cat food in store 1: > prod | store_1 > ------+--------- > A | 50 > (1 row) > > > cat food in store 2: > prod | store_2 > ------+--------- > A | 100 > (1 row) > > > cat food in stores 1 & 2: > prod | store_1 | store_2 | store_all > ------+---------+---------+----------- > A | 100 | 200 | 150 > (1 row) > > > cat food in stores 1 & 2 (sans products): > store_1 | store_2 | store_all > ---------+---------+----------- > 100 | 200 | 150 > (1 row) > > > You can see that store totals for 1 and 2 are not reported correctly > in the last two cases. I can see how this happens, however one would > expect the engine to keep s1, and s2 figures separate from sAll > figures. > > How do I get around this? Thanks -dl > > > __________________________________________________ > Do You Yahoo!? > Yahoo! Finance - Get real-time stock quotes > http://finance.yahoo.com > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > There are 10 kinds of people in the world, those that can do binary > arithmetic and those that can't.
Incidently, this behavior is exactly the same in MySQL and M$ SQLServer. Best I could do to resolve it is to create a complex sub-select in the FROM clause repeating much of the main SQL statement. I could also denormalize the table by having store=100 records for pre calculated sums. But still it strikes me as funny that this is the correct behavior. Or is it a curious side effect? --- David Link <dvlink@yahoo.com> wrote: > Screwy behavior with SUM and multiple joins to same table: > > __START SCRIPT__ > > drop table product; > create table product (prod varchar, name varchar); > insert into product (prod, name) values ('A', 'Cat Food'); > insert into product (prod, name) values ('B', 'Dog Food'); > > drop table sales; > create table sales (prod varchar, store integer, units integer); > insert into sales (prod, store, units) values ('A', 1, 50); > insert into sales (prod, store, units) values ('A', 2, 100); > > > \echo > \echo cat food in store 1: > select p.prod > , sum(s.units) as store_1 > from product p > , sales s > where p.prod = s.prod and store=1 and p.prod='A' > group by p.prod; > > \echo > \echo cat food in store 2: > select p.prod > , sum(s.units) as store_2 > from product p > , sales s > where p.prod = s.prod and store=2 and p.prod='A' > group by p.prod; > > \echo > \echo cat food in stores 1 & 2: > select p.prod > , sum(s1.units) as store_1 > , sum(s2.units) as store_2 > , sum(sAll.units) as store_All > from product p > , sales s1 > , sales s2 > , sales sAll > where p.prod = s1.prod and s1.store=1 > and p.prod = s2.prod and s2.store=2 > and p.prod = sAll.prod and sAll.store in (1,2) > and p.prod='A' > group by p.prod; > > \echo > \echo cat food in stores 1 & 2 (sans products): > select sum(s1.units) as store_1 > , sum(s2.units) as store_2 > , sum(sAll.units) as store_All > from sales s1 > , sales s2 > , sales sAll > where s1.store=1 and s1.prod = 'A' > and s2.store=2 and s2.prod = 'A' > and s2.store in (1,2) and sAll.prod = 'A' > ; > > __END SCRIPT__ > > When run produces: > > animals=# \i animal_food.sql > DROP > CREATE > INSERT 672972905 1 > INSERT 672972906 1 > DROP > CREATE > INSERT 672972934 1 > INSERT 672972935 1 > > cat food in store 1: > prod | store_1 > ------+--------- > A | 50 > (1 row) > > > cat food in store 2: > prod | store_2 > ------+--------- > A | 100 > (1 row) > > > cat food in stores 1 & 2: > prod | store_1 | store_2 | store_all > ------+---------+---------+----------- > A | 100 | 200 | 150 > (1 row) > > > cat food in stores 1 & 2 (sans products): > store_1 | store_2 | store_all > ---------+---------+----------- > 100 | 200 | 150 > (1 row) > > > You can see that store totals for 1 and 2 are not reported correctly > in the last two cases. I can see how this happens, however one would > expect the engine to keep s1, and s2 figures separate from sAll > figures. > > How do I get around this? Thanks -dl > > > __________________________________________________ > Do You Yahoo!? > Yahoo! Finance - Get real-time stock quotes > http://finance.yahoo.com > > ---------------------------(end of > broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org __________________________________________________ Do You Yahoo!? Yahoo! Finance - Get real-time stock quotes http://finance.yahoo.com
Re: Screwy behavior with SUM and multiple joins to same table
From
pgsql-gen@basebeans.com (pgsql-gen Newsgroup (@Basebeans.com))
Date:
Subject: Re: [GENERAL] Screwy behavior with SUM and multiple joins to same table From: David Link <dvlink@yahoo.com> === Incidently, this behavior is exactly the same in MySQL and M$ SQLServer. Best I could do to resolve it is to create a complex sub-select in the FROM clause repeating much of the main SQL statement. I could also denormalize the table by having store=100 records for pre calculated sums. But still it strikes me as funny that this is the correct behavior. Or is it a curious side effect? --- David Link <dvlink@yahoo.com> wrote: > Screwy behavior with SUM and multiple joins to same table: > > __START SCRIPT__ > > drop table product; > create table product (prod varchar, name varchar); > insert into product (prod, name) values ('A', 'Cat Food'); > insert into product (prod, name) values ('B', 'Dog Food'); > > drop table sales; > create table sales (prod varchar, store integer, units integer); > insert into sales (prod, store, units) values ('A', 1, 50); > insert into sales (prod, store, units) values ('A', 2, 100); > > > \echo > \echo cat food in store 1: > select p.prod > , sum(s.units) as store_1 > from product p > , sales s > where p.prod = s.prod and store=1 and p.prod='A' > group by p.prod; > > \echo > \echo cat food in store 2: > select p.prod > , sum(s.units) as store_2 > from product p > , sales s > where p.prod = s.prod and store=2 and p.prod='A' > group by p.prod; > > \echo > \echo cat food in stores 1 & 2: > select p.prod > , sum(s1.units) as store_1 > , sum(s2.units) as store_2 > , sum(sAll.units) as store_All > from product p > , sales s1 > , sales s2 > , sales sAll > where p.prod = s1.prod and s1.store=1 > and p.prod = s2.prod and s2.store=2 > and p.prod = sAll.prod and sAll.store in (1,2) > and p.prod='A' > group by p.prod; > > \echo > \echo cat food in stores 1 & 2 (sans products): > select sum(s1.units) as store_1 > , sum(s2.units) as store_2 > , sum(sAll.units) as store_All > from sales s1 > , sales s2 > , sales sAll > where s1.store=1 and s1.prod = 'A' > and s2.store=2 and s2.prod = 'A' > and s2.store in (1,2) and sAll.prod = 'A' > ; > > __END SCRIPT__ > > When run produces: > > animals=# \i animal_food.sql > DROP > CREATE > INSERT 672972905 1 > INSERT 672972906 1 > DROP > CREATE > INSERT 672972934 1 > INSERT 672972935 1 > > cat food in store 1: > prod | store_1 > ------+--------- > A | 50 > (1 row) > > > cat food in store 2: > prod | store_2 > ------+--------- > A | 100 > (1 row) > > > cat food in stores 1 & 2: > prod | store_1 | store_2 | store_all > ------+---------+---------+----------- > A | 100 | 200 | 150 > (1 row) > > > cat food in stores 1 & 2 (sans products): > store_1 | store_2 | store_all > ---------+---------+----------- > 100 | 200 | 150 > (1 row) > > > You can see that store totals for 1 and 2 are not reported correctly > in the last two cases. I can see how this happens, however one would > expect the engine to keep s1, and s2 figures separate from sAll > figures. > > How do I get around this? Thanks -dl > > > __________________________________________________ > Do You Yahoo!? > Yahoo! Finance - Get real-time stock quotes > http://finance.yahoo.com > > ---------------------------(end of > broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org __________________________________________________ Do You Yahoo!? Yahoo! Finance - Get real-time stock quotes http://finance.yahoo.com ---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html
> > \echo > > \echo cat food in stores 1 & 2: > > select p.prod > > , sum(s1.units) as store_1 > > , sum(s2.units) as store_2 > > , sum(sAll.units) as store_All > > from product p > > , sales s1 > > , sales s2 > > , sales sAll > > where p.prod = s1.prod and s1.store=1 > > and p.prod = s2.prod and s2.store=2 > > and p.prod = sAll.prod and sAll.store in (1,2) > > and p.prod='A' > > group by p.prod; > > > > \echo > > \echo cat food in stores 1 & 2 (sans products): > > select sum(s1.units) as store_1 > > , sum(s2.units) as store_2 > > , sum(sAll.units) as store_All > > from sales s1 > > , sales s2 > > , sales sAll > > where s1.store=1 and s1.prod = 'A' > > and s2.store=2 and s2.prod = 'A' > > and s2.store in (1,2) and sAll.prod = 'A' > > ; > > > > In these last two the joins result in two rows. > s1.units is 50 in each row, s2.units is 100 in each > row. When you sum them you get 100 and 200. > > If you want the queries to be separate, you probably > want subqueries in the general form > select p.prod, (select sum(s1.units) from store_1 where s1.store=1 > and > s1.prod=p.prod), ... from product p where p.prod='A'; Sorry, I didn't see this earlier. Subquery in the SELECT Clause. I suppose. But then I have to repeat a bunch of logic for each store (the real problem has more than just two "stores"). I've created a subquery in the FROM Clause working as if it were a TEMP table. something like this: select sum(s1.units) as store_1 , sum(s2.units) as store_2 , sum(sAll.units) as store_All from sales s1 , sales s2 , (select prod, units from sales s where s.prod = 'A' and s.store in (1,2) ) as sAll where s1.store=1 and s1.prod = 'A' and s2.store=2 and s2.prod = 'A' and s1.prod = sAll.prod ; __________________________________________________ Do You Yahoo!? Yahoo! Finance - Get real-time stock quotes http://finance.yahoo.com
Re: Screwy behavior with SUM and multiple joins to same table
From
pgsql-gen@basebeans.com (pgsql-gen Newsgroup (@Basebeans.com))
Date:
Subject: Re: [GENERAL] Screwy behavior with SUM and multiple joins to same table From: Martijn van Oosterhout <kleptog@svana.org> === I think you have a conceptual problem here. Was happens is that the database prforms your joins first to produce a virtual table of 8 rows, it then applies your where lauses to chop that down and finally does the sum(). Of course, that is only the conceptual model. It's actually more intelligent than that but it gives you an idea of why you're are getting the results you are. Subqueries may be what you're looking for. This is how SQL works and any database will do the same. Hope this helps, On Tue, Aug 27, 2002 at 12:36:42PM -0700, David Link wrote: > Screwy behavior with SUM and multiple joins to same table: > > __START SCRIPT__ > > drop table product; > create table product (prod varchar, name varchar); > insert into product (prod, name) values ('A', 'Cat Food'); > insert into product (prod, name) values ('B', 'Dog Food'); > > drop table sales; > create table sales (prod varchar, store integer, units integer); > insert into sales (prod, store, units) values ('A', 1, 50); > insert into sales (prod, store, units) values ('A', 2, 100); > > > \echo > \echo cat food in store 1: > select p.prod > , sum(s.units) as store_1 > from product p > , sales s > where p.prod = s.prod and store=1 and p.prod='A' > group by p.prod; > > \echo > \echo cat food in store 2: > select p.prod > , sum(s.units) as store_2 > from product p > , sales s > where p.prod = s.prod and store=2 and p.prod='A' > group by p.prod; > > \echo > \echo cat food in stores 1 & 2: > select p.prod > , sum(s1.units) as store_1 > , sum(s2.units) as store_2 > , sum(sAll.units) as store_All > from product p > , sales s1 > , sales s2 > , sales sAll > where p.prod = s1.prod and s1.store=1 > and p.prod = s2.prod and s2.store=2 > and p.prod = sAll.prod and sAll.store in (1,2) > and p.prod='A' > group by p.prod; > > \echo > \echo cat food in stores 1 & 2 (sans products): > select sum(s1.units) as store_1 > , sum(s2.units) as store_2 > , sum(sAll.units) as store_All > from sales s1 > , sales s2 > , sales sAll > where s1.store=1 and s1.prod = 'A' > and s2.store=2 and s2.prod = 'A' > and s2.store in (1,2) and sAll.prod = 'A' > ; > > __END SCRIPT__ > > When run produces: > > animals=# \i animal_food.sql > DROP > CREATE > INSERT 672972905 1 > INSERT 672972906 1 > DROP > CREATE > INSERT 672972934 1 > INSERT 672972935 1 > > cat food in store 1: > prod | store_1 > ------+--------- > A | 50 > (1 row) > > > cat food in store 2: > prod | store_2 > ------+--------- > A | 100 > (1 row) > > > cat food in stores 1 & 2: > prod | store_1 | store_2 | store_all > ------+---------+---------+----------- > A | 100 | 200 | 150 > (1 row) > > > cat food in stores 1 & 2 (sans products): > store_1 | store_2 | store_all > ---------+---------+----------- > 100 | 200 | 150 > (1 row) > > > You can see that store totals for 1 and 2 are not reported correctly > in the last two cases. I can see how this happens, however one would > expect the engine to keep s1, and s2 figures separate from sAll > figures. > > How do I get around this? Thanks -dl > > > __________________________________________________ > Do You Yahoo!? > Yahoo! Finance - Get real-time stock quotes > http://finance.yahoo.com > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > There are 10 kinds of people in the world, those that can do binary > arithmetic and those that can't. ---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> > If you want the queries to be separate, you probably > > want subqueries in the general form > > select p.prod, (select sum(s1.units) from store_1 where s1.store=1 > > and > > s1.prod=p.prod), ... from product p where p.prod='A'; > > Sorry, I didn't see this earlier. > Subquery in the SELECT Clause. I suppose. But then I have to repeat a > bunch of logic for each store (the real problem has more than just two > "stores"). > > I've created a subquery in the FROM Clause working as if it were a TEMP > table. something like this: > > select sum(s1.units) as store_1 > , sum(s2.units) as store_2 > , sum(sAll.units) as store_All > from sales s1 > , sales s2 > , (select prod, units > from sales s > where s.prod = 'A' > and s.store in (1,2) ) as sAll > where s1.store=1 and s1.prod = 'A' > and s2.store=2 and s2.prod = 'A' > and s1.prod = sAll.prod > ; Given the data you gave before, I don't believe this will work any better. The join and where still give 2 rows out. The first part from sales s1 where s1.store=1 and s1.prod='A' returns one row. The second part from sales s2 where s2.store=2 and s2.prod='A' returns one row. The third part from (select ...) as sAll where s1.prod=sAll.prod returns two rows. When you do the join, you end up with two rows out where the s1 and s2 parts get duplicated. Maybe something like: select sum(s1.units) as store_1 , sum(s2.units) as store_2 , sum(sAll.units) as store_All from sales s1 , sales s2 , (select prod, sum(units) from sales s where s.prod = 'A' and s.store in (1,2) group by s.prod) as sAll where s1.store=1 and s1.prod = 'A' and s2.store=2 and s2.prod = 'A' and s1.prod = sAll.prod That'll make the inner subselect give one row I think.
--- Stephan Szabo <sszabo@megazone23.bigpanda.com> wrote: > > Maybe something like: > select sum(s1.units) as store_1 > , sum(s2.units) as store_2 > , sum(sAll.units) as store_All > from sales s1 > , sales s2 > , (select prod, sum(units) > from sales s > where s.prod = 'A' > and s.store in (1,2) > group by s.prod) as sAll > where s1.store=1 and s1.prod = 'A' > and s2.store=2 and s2.prod = 'A' > and s1.prod = sAll.prod > > That'll make the inner subselect give > one row I think. > Yes, you are absolutely right: with the SUM() in the subselect to get back only one row and that's what I have. c/ , (select prod, sum(units) / , (select prod, sum(units) as units __________________________________________________ Do You Yahoo!? Yahoo! Finance - Get real-time stock quotes http://finance.yahoo.com