Re: Optimal time series sampling. - Mailing list pgsql-general

From Gregory Stark
Subject Re: Optimal time series sampling.
Date
Msg-id 87lk98pbcv.fsf@oxford.xeocode.com
Whole thread Raw
In response to Optimal time series sampling.  (Ted Byers <r.ted.byers@rogers.com>)
Responses Re: Optimal time series sampling.  (Ted Byers <r.ted.byers@rogers.com>)
Enabling password complexity for password authentication  (paul rivers <rivers.paul@gmail.com>)
List pgsql-general
"Ted Byers" <r.ted.byers@rogers.com> writes:

> As a prelude to where I really want to go, please
> consider the following SELECT statement.
>
>   SELECT close_price FROM stockprices A
>   WHERE price_date =
>      (SELECT MAX(price_date) FROM stockprices B
>       WHERE A.stock_id = B.stock_id AND A.stock_id = id);

I assume you're missing another "stock_id = id" on the outer query?

I think you'll have to post the actual explain analyze output you're getting
and the precise schema you have. You might need an index on
<stock_id,price_date>.

> It appears to do the right thing.  I certainly get the
> right answer, but I am not seeing a significant
> difference in performance.  Worse, when I invoke
> something like it for a suite of about two dozen
> stocks, it takes about ten minutes to complete.

That would be an entirely different ball of wax than trying to pull out a
single stock's closing price. I suspect you're going to want to use Postgres's
"DISTINCT ON" SQL extension. Something like:

SELECT DISTINCT ON (stock_id,price_date) *
  FROM stockprices
 ORDER BY stock_id, price_date DESC

And you may want an index on < stock_id, price_date DESC >

I believe MySQL does have a similar extension where you can use GROUP BY and
have columns listed in the select target list which aren't included in the
grouping sets.

> So I need a more complex select statement that will just select the most
> recent price for a given stock for each week (or month or quarter or year).

Do you care what happens if there were no trades for a given stock in the time
period? The query you give above using MAX would still work but the query I
described using DISTINCT ON would not emit a record for the stock at all.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

pgsql-general by date:

Previous
From: Christian Schröder
Date:
Subject: Re: (Never?) Kill Postmaster?
Next
From: smiley2211
Date:
Subject: pg_ctl & show all