I've been asked compute various moving averages of end of day sales by
store. I can do this for all rows with no problem (same query without
the WHERE clause towards the end). That query took 10-15 minutes to run
over approx 3.4 million rows. I'm sure they will want this information
to be added to the daily end of day reports. I can run the query below
(excluding the WHERE clause) but it takes almost as long to run one day
as it does the entire dataset. It looks like when I do the inner
select, it is still running over the entire dataset. I have added a
"WHERE eod_ts > CURRENT_TIMESTAMP - INTERVAL '365 days'" (as below) to
the inner query, which allows the query to run between 1-2 minutes.
Question 1) This seems to work, but was curious if there is a better way.
Question 2) Is there a way to specify a date, instead of using current
date and current_timestamp, as a variable and use that in the query? I
know I can do that in my Java program
using variables, but wasn't sure if there was a way to do this with a
function or stored procedure?
INSERT INTO historical_data_avg (store_id, date, avg7sales, avg14sales,
avg30sales, avg60sales, avg90sales, avg180sales) ( SELECT t1.store_id, t1.eod_ts, t1.avg5sales, t1.avg10sales,
t1.avg20sales, t1.avg50sales, t1.avg100sales, t1.avg180sales FROM ( SELECT store_id, eod_ts,
avg(eod_sales)OVER (PARTITION BY store_id ORDER BY eod_ts DESC
ROWS BETWEEN CURRENT ROW AND 4 FOLLOWING) AS avg5sales, avg(eod_sales) OVER (PARTITION BY store_id ORDER BY eod_ts
DESC
ROWS BETWEEN CURRENT ROW AND 9 FOLLOWING) AS avg10sales, avg(eod_sales) OVER (PARTITION BY store_id ORDER BY
eod_tsDESC
ROWS BETWEEN CURRENT ROW AND 19 FOLLOWING) AS avg20sales, avg(eod_sales) OVER (PARTITION BY store_id ORDER BY
eod_tsDESC
ROWS BETWEEN CURRENT ROW AND 49 FOLLOWING) AS avg50sales, avg(eod_sales) OVER (PARTITION BY store_id ORDER BY
eod_tsDESC
ROWS BETWEEN CURRENT ROW AND 99 FOLLOWING) AS avg100sales, avg(eod_sales) OVER (PARTITION BY store_id ORDER BY
eod_tsDESC
ROWS BETWEEN CURRENT ROW AND 179 FOLLOWING) AS avg200sales FROM end_of_day_data WHERE eod_ts > CURRENT_TIMESTAMP
-INTERVAL '260 days' GROUP BY store_id, eod_ts, eod_sales ORDER BY ticker_id, eod_ts ) as t1 WHERE t1.eod_ts =
current_date
);