Thread: Best way to do this query..

Best way to do this query..

From
"Henry Ortega"
Date:
I have the following:<br /><br />name      effective         tstamp                                   rate<br
/>John      01-01-2006     2005-12-07 13:39:07.614945    115.00<br />John       01-16-2006     2006-01-07
13:39:07.614945    125.00<br /> John       01-16-2006     2006-01-09 15:13:04.416935    1885.00<br /><br />I want the
outputto be:<br />name      effective         end_date        rate<br /> John       01-01-2006     01-15-2006   
115.00<br/> John       01-16-2006                         1885.00<br /><br />What is the best way to do this? This is
ona huge table and what I<br />have right now is quite slow. Any ideas?<br /><br /><br /> 

Re: Best way to do this query..

From
Andrew Sullivan
Date:
On Fri, Aug 25, 2006 at 04:16:07PM -0400, Henry Ortega wrote:
> I have the following:
> 
> name      effective         tstamp                                   rate
> John       01-01-2006     2005-12-07 13:39:07.614945    115.00
> John       01-16-2006     2006-01-07 13:39:07.614945    125.00
> John       01-16-2006     2006-01-09 15:13:04.416935    1885.00
> 
> I want the output to be:
> name      effective         end_date        rate
> John       01-01-2006     01-15-2006    115.00
> John       01-16-2006                         1885.00
> 
> What is the best way to do this? This is on a huge table and what I
> have right now is quite slow. Any ideas?

This is not tested, and it's just a sketch.  Seems like you need
another column, and probably a trigger that makes decisions on insert
time about whether the column is to be updated.  That column would be
an end_date column, default infinity.  When an insert comes along,
you DO ALSO update the old row's end_date with a new column.  Then
you do DISTINCT ON max(effective) and use a CASE statement to
suppress the infinity on the current rate.

Does that help?

-- 
Andrew Sullivan  | ajs@crankycanuck.ca
Unfortunately reformatting the Internet is a little more painful 
than reformatting your hard drive when it gets out of whack.    --Scott Morris


Re: Best way to do this query..

From
Richard Broersma Jr
Date:
> I have the following:
> 
> name      effective         tstamp                                   rate
> John       01-01-2006     2005-12-07 13:39:07.614945    115.00
> John       01-16-2006     2006-01-07 13:39:07.614945    125.00
> John       01-16-2006     2006-01-09 15:13:04.416935    1885.00
> 
> I want the output to be:
> name      effective         end_date        rate
> John       01-01-2006     01-15-2006    115.00
> John       01-16-2006                         1885.00
> 
> What is the best way to do this? This is on a huge table and what I
> have right now is quite slow. Any ideas?

I would assume that your data does not change after it is logged.  If this is the case, maybe this
part of your data model would actually be a good canidate for an OLAP data model.

If this is not the direction your are enterested in moving,  you could also reduce your query
processing time by limiting the data ranges for your existing query to something for reasonable.

Regards,

Richard Broersma Jr.


Re: Best way to do this query..

From
"Henry Ortega"
Date:
Yes the data does not change once it is logged.
I am quite new to this whole thing, do you mind elaborating more
about the OLAP data model you mentioned about?

On 8/25/06, Richard Broersma Jr <rabroersma@yahoo.com> wrote:

I would assume that your data does not change after it is logged.  If this is the case, maybe this
part of your data model would actually be a good canidate for an OLAP data model.

If this is not the direction your are enterested in moving,  you could also reduce your query
processing time by limiting the data ranges for your existing query to something for reasonable.

Regards,

Richard Broersma Jr.

Re: Best way to do this query..

From
Richard Broersma Jr
Date:
> Yes the data does not change once it is logged.
> I am quite new to this whole thing, do you mind elaborating more
> about the OLAP data model you mentioned about?

Well just to give a generalization using an OLAP as your data model, the data is stored in a
denormalized yet defined model.  Data integrety is maintained because by definition all records
inserted will remain static.

Here are some additional links that can be used for getting started:

http://en.wikipedia.org/wiki/OLAP
http://www.amazon.com/gp/product/0123695120/sr=8-4/qid=1156546075/ref=pd_bbs_4/002-4041472-4877644?ie=UTF8


Another alternative is to create a materialized view that will update itself only when records are
inserted or updated.

http://jonathangardner.net/PostgreSQL/materialized_views/matviews.html

Regards,

Richard Broersma Jr.