Thread: How to aggregates this data
<wince> Now I've figured why I don't get replies, I'm sending again. Intro: I have a table that contains much data like this: SELECT distinct on (date) asx_code,date,open,high,low,close from sales_summaries where asx_code = 'TLSCA' and date >= '2006-12-01' order by date, time desc limit 10; SELECT distinct on (date) asx_code,date,open,high,low,close,volume from sales_summaries where asx_code = 'TLSCA' and date >= '2006-12-01' order by date, time desc limit 10; asx_code | date | open | high | low | close | volume ----------+------------+-------+-------+-------+-------+---------- TLSCA | 2006-12-01 | 2.330 | 2.340 | 2.300 | 2.300| 29165222 TLSCA | 2006-12-04 | 2.310 | 2.320 | 2.280 | 2.300 | 14973667 TLSCA | 2006-12-05 | 2.310 | 2.320 |2.300 | 2.320 | 23238175 TLSCA | 2006-12-06 | 2.320 | 2.400 | 2.320 | 2.380 | 58357650 TLSCA | 2006-12-07 | 2.380| 2.450 | 2.380 | 2.450 | 64005594 TLSCA | 2006-12-08 | 2.450 | 2.520 | 2.450 | 2.510 | 82831560 TLSCA | 2006-12-11| 2.500 | 2.550 | 2.500 | 2.540 | 38711749 TLSCA | 2006-12-12 | 2.560 | 2.620 | 2.560 | 2.610 | 61709325 TLSCA | 2006-12-13 | 2.600 | 2.600 | 2.540 | 2.590 | 39455386 TLSCA | 2006-12-14 | 2.600 | 2.650 | 2.600 | 2.620 | 30675340 (10 rows) Some may guess (correctly) that this is price information from a stock exchange. What I want to do is consolidate the data into weekly (for starters) periods. Writing a Java (or other procedural language) program to read row by row is fairly straightforward. What I want to do is to do it entirely in SQL. I've perused my book (Mastering SQL by Martin Gruber), the postgresql docs (I have versions here on RHEL (Centos) 4, FC5,6, Debian Testing - up to 8,1) and I don't see how to choose the entry for the first open column, the last close. Max, min and average will do nicely for the others, and converting the date to an interval then dividing by seven seems to work nicely for creating my groups, The results I expect from the above data are TLSCA 2006-12-07 2.330 2.450 2.280 2.450 mumble TLSCA 2006-12-14 2.450 2.650 2.450 2.620 mumble The question is, "How can I do this in SQL?" May the pedants ignore that these weeks don't run Monday to Friday:-) fwiw I'm working on Linux; thus far everything's done in bash scripts aided by friends such as lynx, grep awk and (Of course) psql.
John,<br /> Sub-selects to the rescue. See below.<br /><br /> select s1.asx_code, s1.bdate AS date, s1.low, s1.high, s2.open,s3.close, s1.volume<br /> from (select asx_code, date_trunc('week', date) AS bdate, max(date) AS edate, min(low)AS low, max(high) AS high, sum(volume) AS volume<br /> from sales_summaries<br /> group by asx_code,date_trunc('week', date)) s1, sales_summaries s2, sales_summaries s3<br /> where s1.bdate = s2.date<br /> and s1.asx_code=s2.asx_code<br/> and s1.edate = s3.date<br /> and s1.asx_code=s3.asx_code;<br /><br /> asx_code | date | low | high | open | close | volume <br /> ------------+------------------------+------+------+------+-------+-----------<br/> TLSCA | 2006-12-04 00:00:00-05| 2.28 | 2.52 | 2.31 | 2.51 | 243406646<br /> TLSCA | 2006-12-11 00:00:00-05 | 2.5 | 2.65 | 2.5 | 2.62| 170551800<br /><br /> The "date" is based on ISO-8601 (in other words the week starts on Monday). Be warned, as itis questionable how this will scale. It may require expression (function based) indexes.<br /><br />Oracle has a featurecalled analytic functions, which would allow you to use functions such as LEAD, LAG, FIRST_VALUE, LAST_VALUE. Inparticular FIRST_VALUE and LAST_VALUE would have been useful to determine the open and close for a week, but before analyticsin Oracle you would use sub-selects or multiple joins.<br /><br /><br />
By the way, if you want to include that "incomplete" week before 12/1 (incomplete because it doesn't have a data point forMonday) then you would do this:<br /><br /><br />select s1.asx_code, s1.wdate AS date, s1.low , s1.high, s2.open, s3.close,s1.volume<br />from (select asx_code, date_trunc('week', date) AS wdate, min(date) AS bdate, max(date) AS edate,min(low) AS low, max(high) AS high, SUM(volume) AS volume<br /> from sales_summaries <br /> group by asx_code,date_trunc('week', date)) s1, sales_summaries s2, sales_summaries s3<br />where s1.bdate = s2.date<br />and s1.asx_code=s2.asx_code<br/>and s1.edate = s3.date<br />and s1.asx_code=s3.asx_code ;<br /><br /><br />
On Thu, Jan 11, 2007 at 07:50:19 +0900, John Summerfield <postgres@herakles.homelinux.org> wrote: > > I've perused my book (Mastering SQL by Martin Gruber), the postgresql > docs (I have versions here on RHEL (Centos) 4, FC5,6, Debian Testing - > up to 8,1) and I don't see how to choose the entry for the first open > column, the last close. Max, min and average will do nicely for the > others, and converting the date to an interval then dividing by seven > seems to work nicely for creating my groups, > > The results I expect from the above data are > TLSCA 2006-12-07 2.330 2.450 2.280 2.450 mumble > TLSCA 2006-12-14 2.450 2.650 2.450 2.620 mumble > > The question is, "How can I do this in SQL?" There are date functions that return the week of the year, so that you could do a group by week and year to get your aggregates. To get the open and close prices, you could join back to the table (twice) on week, year and either the first or last day of the week. As long as there is data for every day of the week for every stock, this should work.