Thread: query help

query help

From
Kirk Wythers
Date:
I need some help with rewriting a query. I have a fairly complicated query (for me anyway) that dumps daily climate data, filling in missing data with monthly averages (one line per day).  

I want to output monthly averages (one line per month). I am having a hard time wrapping my head around this. Particularly how to deal with the doy column (day of year). I have tried several approaches and my forehead is starting to get my keyboard bloody. 

Thanks in advance for any suggestions.

Here is the daily query:

SELECT CASE
WHEN w.station_id = site_near.station_id THEN w.obs_id
ELSE s.obs_id
END AS obs_id,
site_near.station_id,
site_near.longname,
w.year,
w.doy,
--replace missing values (-999) with the monthly average
       CASE w.tmax
         WHEN -999 THEN avgtmax.avg
         ELSE w.tmax
       END,
CASE w.tmin
         WHEN -999 THEN avgtmin.avg
         ELSE w.tmin
       END,
CASE s.par
         WHEN -999 THEN avgpar.avg
         ELSE s.par
       END,
CASE w.precip
         WHEN -999 THEN avgprecip.avg
         ELSE w.precip
       END
FROM  site_near
       INNER JOIN solar s
         ON (site_near.ref_solar_station_id = s.station_id
             AND site_near.obs_year = s.year)
       INNER JOIN weather w
         ON (site_near.ref_weather_station_id = w.station_id
             AND site_near.obs_year = w.year
             AND s.date = w.date)
       INNER JOIN (SELECT   MONTH,
                            round(avg(tmax)::numeric, 2) AS avg
                   FROM     weather
                   WHERE    tmax != -999
                   GROUP BY MONTH) AS avgtmax
         ON (w.month = avgtmax.month)
INNER JOIN (SELECT   MONTH,
                            round(avg(tmin)::numeric, 2) AS avg
                   FROM     weather
                   WHERE    tmin != -999
                   GROUP BY MONTH) AS avgtmin
         ON (w.month = avgtmin.month)
       INNER JOIN (SELECT   MONTH,
                            round(avg(par)::numeric, 2) AS avg
                   FROM     solar
                   WHERE    par != -999
                   GROUP BY MONTH) AS avgpar
         ON (s.month = avgpar.month)
INNER JOIN (SELECT   MONTH,
                            round(avg(precip)::numeric, 2) AS avg
                   FROM     weather
                   WHERE    precip != -999
                   GROUP BY MONTH) AS avgprecip
         ON (w.month = avgprecip.month)
--select station to output climate data by id number
WHERE  w.station_id = 219101

Re: query help

From
"Josh Tolley"
Date:
On 8/14/07, Kirk Wythers <kwythers@umn.edu> wrote:
>
> I need some help with rewriting a query. I have a fairly complicated query
> (for me anyway) that dumps daily climate data, filling in missing data with
> monthly averages (one line per day).
>
> I want to output monthly averages (one line per month). I am having a hard
> time wrapping my head around this. Particularly how to deal with the doy
> column (day of year). I have tried several approaches and my forehead is
> starting to get my keyboard bloody.

I think this came up on IRC today, so perhaps this is only for the
archives' sake, but you want to do something like this:

Assuming you have a table as follows:

CREATE TABLE climate_data (
   measurement_time  timestamp,
   measurement_value  integer);

...and you insert data into it regularly, you can get the average
measurement over a period of time with date_trunc(), which will
truncate a date or timestamp value to match whatever precision you
specify. For example, see the following:

eggyknap=# select date_trunc('month', now());
       date_trunc
------------------------
 2007-08-01 00:00:00-06
(1 row)

Note: the -06 at the end means I'm in mountain time.

So if you want to get the average measurement over a month's time, you
need to do something like this:

SELECT DATE_TRUNC('MONTH', measurement_time), AVG(measurement_value)
FROM climate_data GROUP BY DATE_TRUNC('MONTH', measurement_time);

This will chop all the measurement_time values down to the month the
measurement was taken in, put all measurements in groups based on the
resulting value, and take the average measurement_value from each
group.

- Josh