Home > mailing lists

Re: Optimal time series sampling. - Mailing list pgsql-general

From	Gregory Stark
Subject	Re: Optimal time series sampling.
Date	November 8, 2007 19:27:07
Msg-id	87lk98pbcv.fsf@oxford.xeocode.com Whole thread Raw
In response to	Optimal time series sampling. (Ted Byers <r.ted.byers@rogers.com>)
Responses	Re: Optimal time series sampling. Enabling password complexity for password authentication
List	pgsql-general

Tree view

"Ted Byers" <r.ted.byers@rogers.com> writes:

> As a prelude to where I really want to go, please
> consider the following SELECT statement.
>
>   SELECT close_price FROM stockprices A
>   WHERE price_date =
>      (SELECT MAX(price_date) FROM stockprices B
>       WHERE A.stock_id = B.stock_id AND A.stock_id = id);

I assume you're missing another "stock_id = id" on the outer query?

I think you'll have to post the actual explain analyze output you're getting
and the precise schema you have. You might need an index on
<stock_id,price_date>.

> It appears to do the right thing.  I certainly get the
> right answer, but I am not seeing a significant
> difference in performance.  Worse, when I invoke
> something like it for a suite of about two dozen
> stocks, it takes about ten minutes to complete.

That would be an entirely different ball of wax than trying to pull out a
single stock's closing price. I suspect you're going to want to use Postgres's
"DISTINCT ON" SQL extension. Something like:

SELECT DISTINCT ON (stock_id,price_date) *
  FROM stockprices
 ORDER BY stock_id, price_date DESC

And you may want an index on < stock_id, price_date DESC >

I believe MySQL does have a similar extension where you can use GROUP BY and
have columns listed in the select target list which aren't included in the
grouping sets.

> So I need a more complex select statement that will just select the most
> recent price for a given stock for each week (or month or quarter or year).

Do you care what happens if there were no trades for a given stock in the time
period? The query you give above using MAX would still work but the query I
described using DISTINCT ON would not emit a record for the stock at all.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

pgsql-general by date:

From: Christian Schröder
Date: 08 November 2007, 19:03:06
Subject: Re: (Never?) Kill Postmaster?

From: smiley2211
Date: 08 November 2007, 20:16:30
Subject: pg_ctl & show all

Re: Optimal time series sampling. - Mailing list pgsql-general

Previous

Next