Thread: Bucketing Row Data in columns

Bucketing Row Data in columns

From
Sandeep
Date:
Hi all,<br /> I need help on creating a sql, not a problem even if its pl/sql<br /><br /> I have orders table schema
isas follow<br /><br /> orders(order_id,user_id, create_timestamp, amount)<br /><br /> and I want to generate a report
like<br/> for the past 3 days bucketing purchases i.e SUM(amount) every day in columns<br /> i.e result will be having
thesecolumns.<br /><br /> (user_id, amount_day1, amount_day2, amount_day3)<br /><br /> ex:<br /> am leaving order_id
assumethey are auto incrementing and unique, date format dd/mm/yyyy<br /> (user_id, create_timestamp, amount)<br />
(user1,01/01/2009,100)<br /> (user1, 01/01/2009,100)<br /> (user2, 01/01/2009,100)<br /> (user2, 02/01/2009,100)<br />
(user2,02/01/2009,100)<br /> (user1, 02/01/2009,100)<br /> (user2, 03/01/2009,100)<br /> (user2, 03/01/2009,100)<br />
(user3,03/01/2009,100)<br /><br /><br /> result<br /><br /> (user_id, amount_day1, amount_day2, amount_day3)<br />
(user1,200, 200, 0)<br /> (user2, 100, 200, 200)<br /> (user3, 0, 0, 100)<br /><br /><br /> hope you guys got what I am
tryingto generate through sql.<br /><br /> I could get this data in each row, but I want it in columns.<br /> Can
anyonehelp me on this? lets assume the buckets are fixed i.e 3 only. but I wish to get them unlimited i.e day 1 to day
20.<br/><br /><table cellpadding="0" cellspacing="0" width="100%"><tr><td> Regards<br /> Sandeep Bandela
</td></tr></table>

Re: Bucketing Row Data in columns

From
James Kitambara
Date:
 
Hello Mr. Sandeep Bandela,
 
I have gone through your scenario and come up with the following solution.
 
SELECT USER_ID, CREATE_TIMESTAMP::DATE, SUM(AMOUNT)
FROM ORDERS
GROUP BY USER_ID, CREATE_TIMESTAMP
ORDER BY USER_ID, CREATE_TIMESTAMP;
 
Maybe you need to do little modification on the query to get what you want.
 
Best Regards
 
James Kitambara
Database Administrator

-------------------------------------ORGINAL MESSAGE------------------------------------------------
--- On Wed, 24/6/09, Sandeep <gibsosmat@gmail.com> wrote:

From: Sandeep <gibsosmat@gmail.com>
Subject: [SQL] Bucketing Row Data in columns
To: pgsql-sql@postgresql.org
Date: Wednesday, 24 June, 2009, 5:39 PM

Hi all,
I need help on creating a sql, not a problem even if its pl/sql

I have orders table schema is as follow

orders(order_id,user_id, create_timestamp, amount)

and I want to generate a report like
for the past 3 days bucketing purchases i.e SUM(amount) every day in columns
i.e result will be having these columns.

(user_id, amount_day1, amount_day2, amount_day3)

ex:
am leaving order_id assume they are auto incrementing and unique, date format dd/mm/yyyy
(user_id, create_timestamp, amount)
(user1, 01/01/2009,100)
(user1, 01/01/2009,100)
(user2, 01/01/2009,100)
(user2, 02/01/2009,100)
(user2, 02/01/2009,100)
(user1, 02/01/2009,100)
(user2, 03/01/2009,100)
(user2, 03/01/2009,100)
(user3, 03/01/2009,100)


result

(user_id, amount_day1, amount_day2, amount_day3)
(user1, 200, 200, 0)
(user2, 100, 200, 200)
(user3, 0, 0, 100)


hope you guys got what I am trying to generate through sql.

I could get this data in each row, but I want it in columns.
Can anyone help me on this? lets assume the buckets are fixed i.e 3 only. but I wish to get them unlimited i.e day 1 to day 20.

Regards
Sandeep Bandela

Re: Bucketing Row Data in columns

From
"Oliveiros Cristina"
Date:

Hello, Sandeep,
 
I am not sure if this is what you want.
 
I came up with this query
 
SELECT *
FROM (SELECT user_id,SUM(amount) as day1 FROM t1 WHERE create_timestamp = '2009-1-1' GROUP BY "user_id") a
NATURAL FULL JOIN (SELECT user_id,SUM(amount) as day2 FROM t1 WHERE create_timestamp = '2009-1-2' GROUP BY "user_id") b
NATURAL FULL JOIN (SELECT user_id,SUM(amount) as day3 FROM t1 WHERE create_timestamp = '2009-1-3' GROUP BY "user_id") c
 
The solution is not totally correct because it returns NULL in the places you return 0.
It seems the SUM() returns NULL when it gets an all NULL column...
Is it problematic for your application ?
 
Also, i 'm not sure if I fully understand your last sentence
lets assume the buckets are fixed i.e 3 only. but I wish to get them unlimited i.e day 1 to day 20.
 
You say that the buckets are fixed at 3. So, you mean the table output will always have 4 columns? 3 days plus one for user_id ?
If you want 20 buckets it must be a different query...
 
Could you please clarify what you mean when you say that you want to get a bucket unlimited ?
 
Best,
Oliveiros
----- Original Message -----
 
From: Sandeep
Sent: Wednesday, June 24, 2009 5:39 PM
Subject: [SQL] Bucketing Row Data in columns

Hi all,
I need help on creating a sql, not a problem even if its pl/sql

I have orders table schema is as follow

orders(order_id,user_id, create_timestamp, amount)

and I want to generate a report like
for the past 3 days bucketing purchases i.e SUM(amount) every day in columns
i.e result will be having these columns.

(user_id, amount_day1, amount_day2, amount_day3)

ex:
am leaving order_id assume they are auto incrementing and unique, date format dd/mm/yyyy
(user_id, create_timestamp, amount)
(user1, 01/01/2009,100)
(user1, 01/01/2009,100)
(user2, 01/01/2009,100)
(user2, 02/01/2009,100)
(user2, 02/01/2009,100)
(user1, 02/01/2009,100)
(user2, 03/01/2009,100)
(user2, 03/01/2009,100)
(user3, 03/01/2009,100)


result

(user_id, amount_day1, amount_day2, amount_day3)
(user1, 200, 200, 0)
(user2, 100, 200, 200)
(user3, 0, 0, 100)


hope you guys got what I am trying to generate through sql.

I could get this data in each row, but I want it in columns.
Can anyone help me on this? lets assume the buckets are fixed i.e 3 only. but I wish to get them unlimited i.e day 1 to day 20.

Regards
Sandeep Bandela

Re: Bucketing Row Data in columns

From
Rob Sargent
Date:
I would be suspicious of this sort of solution of turning rows into 
columns by mean of a series of correlated sub-selects.  Once the data 
set gets large and the number of columns goes over 2 or 3 this will in 
all likelihood not perform well. 
I had the pleasure of re-writing a "report" which was based on count() 
(similar to sum()) per user_id with the counts going into various 
columns per user.  18000 users, a dozen columns from table of 2 million 
rows, report took >1,000,000 seconds (yes almost 12 days) to complete.  
Re-write runs in 5-10 minutes (now at 10M rows) by getting the counts as 
rows (user, item, count) into a temp table and making the columns from 
the temp table (pl/psql)  Getting the counts takes half the time, making 
the flattened report takes half the time.




Oliveiros Cristina wrote:
> Hello, Sandeep,
>  
> I am not sure if this is what you want.
>  
> I came up with this query
>  
> SELECT *
> FROM (SELECT user_id,SUM(amount) as day1 FROM t1 WHERE 
> create_timestamp = '2009-1-1' GROUP BY "user_id") a
> NATURAL FULL JOIN (SELECT user_id,SUM(amount) as day2 FROM t1 WHERE 
> create_timestamp = '2009-1-2' GROUP BY "user_id") b
> NATURAL FULL JOIN (SELECT user_id,SUM(amount) as day3 FROM t1 WHERE 
> create_timestamp = '2009-1-3' GROUP BY "user_id") c
>  
> The solution is not totally correct because it returns NULL in the 
> places you return 0.
> It seems the SUM() returns NULL when it gets an all NULL column...
> Is it problematic for your application ?
>  
> Also, i 'm not sure if I fully understand your last sentence
> /lets assume the buckets are fixed i.e 3 only. but I wish to get them 
> unlimited i.e day 1 to day 20./
>  
> You say that the buckets are fixed at 3. So, you mean the table output 
> will always have 4 columns? 3 days plus one for user_id ?
> If you want 20 buckets it must be a different query...
>  
> Could you please clarify what you mean when you say that you want to 
> get a bucket unlimited ?
>  
> Best,
> Oliveiros
>
>     ----- Original Message -----
>      
>     *From:* Sandeep <mailto:gibsosmat@gmail.com>
>     *To:* pgsql-sql@postgresql.org <mailto:pgsql-sql@postgresql.org>
>     *Sent:* Wednesday, June 24, 2009 5:39 PM
>     *Subject:* [SQL] Bucketing Row Data in columns
>
>     Hi all,
>     I need help on creating a sql, not a problem even if its pl/sql
>
>     I have orders table schema is as follow
>
>     orders(order_id,user_id, create_timestamp, amount)
>
>     and I want to generate a report like
>     for the past 3 days bucketing purchases i.e SUM(amount) every day
>     in columns
>     i.e result will be having these columns.
>
>     (user_id, amount_day1, amount_day2, amount_day3)
>
>     ex:
>     am leaving order_id assume they are auto incrementing and unique,
>     date format dd/mm/yyyy
>     (user_id, create_timestamp, amount)
>     (user1, 01/01/2009,100)
>     (user1, 01/01/2009,100)
>     (user2, 01/01/2009,100)
>     (user2, 02/01/2009,100)
>     (user2, 02/01/2009,100)
>     (user1, 02/01/2009,100)
>     (user2, 03/01/2009,100)
>     (user2, 03/01/2009,100)
>     (user3, 03/01/2009,100)
>
>
>     result
>
>     (user_id, amount_day1, amount_day2, amount_day3)
>     (user1, 200, 200, 0)
>     (user2, 100, 200, 200)
>     (user3, 0, 0, 100)
>
>
>     hope you guys got what I am trying to generate through sql.
>
>     I could get this data in each row, but I want it in columns.
>     Can anyone help me on this? lets assume the buckets are fixed i.e
>     3 only. but I wish to get them unlimited i.e day 1 to day 20.
>
>     Regards
>     Sandeep Bandela
>



Re: Bucketing Row Data in columns

From
"Oliveiros Cristina"
Date:
I admit that must be a more elegant and faster solution with pl/psql (or 
whatever other languages)

As I don't know nothing about pl/psql I tried with pure sql (if you don't 
have a hunting dog, hunt with a cat)

But obviously this solution doesn't scale well if you have a giant table 
with lots of columns

----- Original Message ----- 
From: "Rob Sargent" <robjsargent@gmail.com>
To: <pgsql-sql@postgresql.org>
Sent: Thursday, June 25, 2009 4:57 PM
Subject: Re: [SQL] Bucketing Row Data in columns


>I would be suspicious of this sort of solution of turning rows into columns 
>by mean of a series of correlated sub-selects.  Once the data set gets 
>large and the number of columns goes over 2 or 3 this will in all 
>likelihood not perform well. I had the pleasure of re-writing a "report" 
>which was based on count() (similar to sum()) per user_id with the counts 
>going into various columns per user.  18000 users, a dozen columns from 
>table of 2 million rows, report took >1,000,000 seconds (yes almost 12 
>days) to complete.  Re-write runs in 5-10 minutes (now at 10M rows) by 
>getting the counts as rows (user, item, count) into a temp table and making 
>the columns from the temp table (pl/psql)  Getting the counts takes half 
>the time, making the flattened report takes half the time.
>
>
>
>
> Oliveiros Cristina wrote:
>> Hello, Sandeep,
>>  I am not sure if this is what you want.
>>  I came up with this query
>>  SELECT *
>> FROM (SELECT user_id,SUM(amount) as day1 FROM t1 WHERE create_timestamp = 
>> '2009-1-1' GROUP BY "user_id") a
>> NATURAL FULL JOIN (SELECT user_id,SUM(amount) as day2 FROM t1 WHERE 
>> create_timestamp = '2009-1-2' GROUP BY "user_id") b
>> NATURAL FULL JOIN (SELECT user_id,SUM(amount) as day3 FROM t1 WHERE 
>> create_timestamp = '2009-1-3' GROUP BY "user_id") c
>>  The solution is not totally correct because it returns NULL in the 
>> places you return 0.
>> It seems the SUM() returns NULL when it gets an all NULL column...
>> Is it problematic for your application ?
>>  Also, i 'm not sure if I fully understand your last sentence
>> /lets assume the buckets are fixed i.e 3 only. but I wish to get them 
>> unlimited i.e day 1 to day 20./
>>  You say that the buckets are fixed at 3. So, you mean the table output 
>> will always have 4 columns? 3 days plus one for user_id ?
>> If you want 20 buckets it must be a different query...
>>  Could you please clarify what you mean when you say that you want to get 
>> a bucket unlimited ?
>>  Best,
>> Oliveiros
>>
>>     ----- Original Message -----
>>      *From:* Sandeep <mailto:gibsosmat@gmail.com>
>>     *To:* pgsql-sql@postgresql.org <mailto:pgsql-sql@postgresql.org>
>>     *Sent:* Wednesday, June 24, 2009 5:39 PM
>>     *Subject:* [SQL] Bucketing Row Data in columns
>>
>>     Hi all,
>>     I need help on creating a sql, not a problem even if its pl/sql
>>
>>     I have orders table schema is as follow
>>
>>     orders(order_id,user_id, create_timestamp, amount)
>>
>>     and I want to generate a report like
>>     for the past 3 days bucketing purchases i.e SUM(amount) every day
>>     in columns
>>     i.e result will be having these columns.
>>
>>     (user_id, amount_day1, amount_day2, amount_day3)
>>
>>     ex:
>>     am leaving order_id assume they are auto incrementing and unique,
>>     date format dd/mm/yyyy
>>     (user_id, create_timestamp, amount)
>>     (user1, 01/01/2009,100)
>>     (user1, 01/01/2009,100)
>>     (user2, 01/01/2009,100)
>>     (user2, 02/01/2009,100)
>>     (user2, 02/01/2009,100)
>>     (user1, 02/01/2009,100)
>>     (user2, 03/01/2009,100)
>>     (user2, 03/01/2009,100)
>>     (user3, 03/01/2009,100)
>>
>>
>>     result
>>
>>     (user_id, amount_day1, amount_day2, amount_day3)
>>     (user1, 200, 200, 0)
>>     (user2, 100, 200, 200)
>>     (user3, 0, 0, 100)
>>
>>
>>     hope you guys got what I am trying to generate through sql.
>>
>>     I could get this data in each row, but I want it in columns.
>>     Can anyone help me on this? lets assume the buckets are fixed i.e
>>     3 only. but I wish to get them unlimited i.e day 1 to day 20.
>>
>>     Regards
>>     Sandeep Bandela
>>
>
>
> -- 
> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-sql 



Re: Bucketing Row Data in columns

From
Osvaldo Kussama
Date:
2009/6/25 James Kitambara <jameskitambara@yahoo.co.uk>:
>
> Hello Mr. Sandeep Bandela,
>
> I have gone through your scenario and come up with the following solution.
>
> SELECT USER_ID, CREATE_TIMESTAMP::DATE, SUM(AMOUNT)
> FROM ORDERS
> GROUP BY USER_ID, CREATE_TIMESTAMP
> ORDER BY USER_ID, CREATE_TIMESTAMP;
>
> Maybe you need to do little modification on the query to get what you want.
>


Contrib tablefunc/crosstab function may help you.
http://www.postgresql.org/docs/current/interactive/tablefunc.html

Osvaldo


Re: Bucketing Row Data in columns

From
bricklen
Date:
Assuming you know your dates beforehand, you could try a CASE
statement. Something like:
select order_id, sum(case when timestamp::date = 01/01/2009'' then amount else 0 end)
as amount_day1, sum(case when timestamp::date = '02/01/2009' then amount else 0 end)
as amount_day2, sum(case when timestamp::date = '03/01/2009' then amount else 0 end)
as amount_day3
from orders
group by order_id

On Wed, Jun 24, 2009 at 9:39 AM, Sandeep<gibsosmat@gmail.com> wrote:
> Hi all,
> I need help on creating a sql, not a problem even if its pl/sql
>
> I have orders table schema is as follow
>
> orders(order_id,user_id, create_timestamp, amount)
>
> and I want to generate a report like
> for the past 3 days bucketing purchases i.e SUM(amount) every day in columns
> i.e result will be having these columns.
>
> (user_id, amount_day1, amount_day2, amount_day3)
>
> ex:
> am leaving order_id assume they are auto incrementing and unique, date
> format dd/mm/yyyy
> (user_id, create_timestamp, amount)
> (user1, 01/01/2009,100)
> (user1, 01/01/2009,100)
> (user2, 01/01/2009,100)
> (user2, 02/01/2009,100)
> (user2, 02/01/2009,100)
> (user1, 02/01/2009,100)
> (user2, 03/01/2009,100)
> (user2, 03/01/2009,100)
> (user3, 03/01/2009,100)
>
>
> result
>
> (user_id, amount_day1, amount_day2, amount_day3)
> (user1, 200, 200, 0)
> (user2, 100, 200, 200)
> (user3, 0, 0, 100)
>
>
> hope you guys got what I am trying to generate through sql.
>
> I could get this data in each row, but I want it in columns.
> Can anyone help me on this? lets assume the buckets are fixed i.e 3 only.
> but I wish to get them unlimited i.e day 1 to day 20.
>
> Regards
> Sandeep Bandela