Re: Design Question (Time Series Data) - Mailing list pgsql-general

From Andreas Strasser
Subject Re: Design Question (Time Series Data)
Date
Msg-id fe304g$2vp8$1@news.hub.org
Whole thread Raw
In response to Re: Design Question (Time Series Data)  (Ted Byers <r.ted.byers@rogers.com>)
List pgsql-general
Ted Byers wrote:

> If you really have such a disparity among your series,
> then it is a mistake to blend them into a single
> table.  You really need to spend more time analyzing
> what the data means.  If one data set is comprised of
> the daily close price of a suite of stocks or mutual
> funds, then it makes sense to include all such series
> in a given table, but if some of the series are daily
> close price and others are monthly averages, then it
> is a mistake to combine them in a single table, and
> two or more would be warranted.  Or if the data are
> from different data feed vendors, then you have to
> think very carefully whether or not the data can
> logically be combined.

 > I don't much like any of the above.  When I have had
 > to process data for financial consultants, I applied a
 > few simple filters to ensure the data is clean (e.g.
 > tests to ensure data hasn't been corrupted during
 > transmission, proper handling of missing data, &c.),
 > and then bulk loaded the data into a suite of tables
 > designed specifically to match the vendor's
 > definitions of what the data means.  Only then did we
 > apply specific analyses designed in consultation with
 > the financial consultant's specialists; folk best
 > qualified to help us understand how best to understand
 > the data and especially how it can be combined in a
 > meaningful way.

Thank you for your long and detailed answer. It would be possible for me
to group those series by frequency or vendor (or even by size), but this
would make sql-queries really complicated. Grouping the series by theme
would probably work but be a real pain and i doubt whether this has a
real advantage for my application.

The real analysis of the data will happen on the client-side, the server
has only to ensure that the time series are up to date and not corrupted
- it doesn't care about the meaning of the value of any given field (I
was probably a little bit unclear about that). It will (with the help of
R - via PL/R) do a few simple calculations and transformations, but
those can be applied without much knowledge about the properties of the
data.

> I do not know if PostgreSQL, or any other RDBMS,
> includes the ability to call on software such as "R"
> to do specific statistical analysis, but if I had to
> do some time series analysis, I would do it in a
> client application that retrieves the appropriate data
> from the database and either does the analysis in
> custom code I have written (usually in C++, as some of
> my favourite analyses have not made it into commonly
> available open source or commercial statistical
> software) or invokes the appropriate functions from
> statistical software I have at my disposal.  The
> strategy I describe above makes the SQL required for
> much of this dirt simple.

Exactly, i'm looking for a simple and clean way to store data(possibly
transform its frequency) and serve it to the clients (in different
formats).

A typical request would be "Give me quarterly GDP and quarterly Import
figures for the US between 1987-01 to 2007-2 plus the effective federal
funds rate (transformed to quarterly averages) for the same period".

The server therefore has to grab the values for the first two series
(both are much longer - go back to the 1930's or 40's), extract the
values for the requested period (pad it if it's too short). It will do
the same for the third series and transform it to quarterly values (only
available for months/weeks). Finally it will return a table with a
date-column and 3 columns for the requested data.

Most of the stuff will be done in stored procedures, since i don't want
to mess around with it in php (which acts as as proxy for the clients).

Andreas

pgsql-general by date:

Previous
From: Ron Johnson
Date:
Subject: Re: good sql tutorial
Next
From: Geoffrey
Date:
Subject: Re: good sql tutorial