Thread: How to aggregates this data

How to aggregates this data

From
John Summerfield
Date:
<wince> Now I've figured why I don't get replies, I'm sending again.

Intro:
I have a table that contains much data like this: SELECT distinct on  (date) asx_code,date,open,high,low,close from
sales_summaries where asx_code =  'TLSCA' and date >= '2006-12-01' order
by date, time desc  limit 10;
SELECT distinct on  (date) asx_code,date,open,high,low,close,volume from
sales_summaries where asx_code =  'TLSCA' and date >= '2006-12-01' order
by date, time desc  limit 10; asx_code |    date    | open  | high  |  low  | close |  volume
----------+------------+-------+-------+-------+-------+---------- TLSCA    | 2006-12-01 | 2.330 | 2.340 | 2.300 |
2.300| 29165222 TLSCA    | 2006-12-04 | 2.310 | 2.320 | 2.280 | 2.300 | 14973667 TLSCA    | 2006-12-05 | 2.310 | 2.320
|2.300 | 2.320 | 23238175 TLSCA    | 2006-12-06 | 2.320 | 2.400 | 2.320 | 2.380 | 58357650 TLSCA    | 2006-12-07 |
2.380| 2.450 | 2.380 | 2.450 | 64005594 TLSCA    | 2006-12-08 | 2.450 | 2.520 | 2.450 | 2.510 | 82831560 TLSCA    |
2006-12-11| 2.500 | 2.550 | 2.500 | 2.540 | 38711749 TLSCA    | 2006-12-12 | 2.560 | 2.620 | 2.560 | 2.610 | 61709325
TLSCA   | 2006-12-13 | 2.600 | 2.600 | 2.540 | 2.590 | 39455386 TLSCA    | 2006-12-14 | 2.600 | 2.650 | 2.600 | 2.620 |
30675340
(10 rows)


Some may guess (correctly) that this is price information from a stock
exchange. What I want to do is consolidate the data into weekly (for
starters) periods. Writing a Java (or other procedural language) program
to read row by row is fairly straightforward. What I want to do is to do
it entirely in SQL.

I've perused my book (Mastering SQL by Martin Gruber), the postgresql
docs (I have versions here on RHEL (Centos) 4, FC5,6, Debian Testing -
up to 8,1) and I don't see how to choose the entry for the first open
column, the last close. Max, min and average will do nicely for the
others, and converting the date to an interval then dividing by seven
seems to work nicely for creating my groups,

The results I expect from the above data are
TLSCA 2006-12-07 2.330 2.450 2.280 2.450 mumble
TLSCA 2006-12-14 2.450 2.650 2.450 2.620 mumble

The question is, "How can I do this in SQL?"

May the pedants ignore that these weeks don't run Monday to Friday:-)

fwiw I'm working on Linux; thus far everything's done in bash scripts
aided by friends such as lynx, grep awk and (Of course) psql.



Re: How to aggregates this data

From
"Chad Wagner"
Date:
John,<br />   Sub-selects to the rescue.  See below.<br /><br /> select s1.asx_code, s1.bdate AS date, s1.low, s1.high,
s2.open,s3.close, s1.volume<br /> from (select asx_code, date_trunc('week', date) AS bdate, max(date) AS edate,
min(low)AS low, max(high) AS high, sum(volume) AS volume<br />       from sales_summaries<br />       group by
asx_code,date_trunc('week', date)) s1, sales_summaries s2, sales_summaries s3<br /> where s1.bdate = s2.date<br /> and
s1.asx_code=s2.asx_code<br/> and s1.edate = s3.date<br /> and s1.asx_code=s3.asx_code;<br /><br />   asx_code 
|         date          | low  | high | open | close |  volume   <br />
------------+------------------------+------+------+------+-------+-----------<br/>   TLSCA     | 2006-12-04
00:00:00-05| 2.28 | 2.52 | 2.31 |  2.51 | 243406646<br />   TLSCA     | 2006-12-11 00:00:00-05 |  2.5 | 2.65 |  2.5 | 
2.62| 170551800<br /><br /> The "date" is based on ISO-8601 (in other words the week starts on Monday).  Be warned, as
itis questionable how this will scale.  It may require expression (function based) indexes.<br /><br />Oracle has a
featurecalled analytic functions, which would allow you to use functions such as LEAD, LAG, FIRST_VALUE, LAST_VALUE. 
Inparticular FIRST_VALUE and LAST_VALUE would have been useful to determine the open and close for a week, but before
analyticsin Oracle you would use sub-selects or multiple joins.<br /><br /><br /> 

Re: How to aggregates this data

From
"Chad Wagner"
Date:
By the way, if you want to include that "incomplete" week before 12/1 (incomplete because it doesn't have a data point
forMonday) then you would do this:<br /><br /><br />select s1.asx_code, s1.wdate AS date, s1.low , s1.high, s2.open,
s3.close,s1.volume<br />from (select asx_code, date_trunc('week', date) AS wdate, min(date) AS bdate, max(date) AS
edate,min(low) AS low, max(high) AS high, SUM(volume) AS volume<br />      from sales_summaries <br />      group by
asx_code,date_trunc('week', date)) s1, sales_summaries s2, sales_summaries s3<br />where s1.bdate = s2.date<br />and
s1.asx_code=s2.asx_code<br/>and s1.edate = s3.date<br />and s1.asx_code=s3.asx_code ;<br /><br /><br /> 

Re: How to aggregates this data

From
Bruno Wolff III
Date:
On Thu, Jan 11, 2007 at 07:50:19 +0900, John Summerfield <postgres@herakles.homelinux.org> wrote:
> 
> I've perused my book (Mastering SQL by Martin Gruber), the postgresql
> docs (I have versions here on RHEL (Centos) 4, FC5,6, Debian Testing -
> up to 8,1) and I don't see how to choose the entry for the first open
> column, the last close. Max, min and average will do nicely for the
> others, and converting the date to an interval then dividing by seven
> seems to work nicely for creating my groups,
> 
> The results I expect from the above data are
> TLSCA 2006-12-07 2.330 2.450 2.280 2.450 mumble
> TLSCA 2006-12-14 2.450 2.650 2.450 2.620 mumble
> 
> The question is, "How can I do this in SQL?"

There are date functions that return the week of the year, so that you
could do a group by week and year to get your aggregates. To get the open
and close prices, you could join back to the table (twice) on week, year and
either the first or last day of the week. As long as there is data for every
day of the week for every stock, this should work.