Re: partial time stamp query - Mailing list pgsql-general

From Kirk Wythers
Subject Re: partial time stamp query
Date
Msg-id D54FF3CA-76DC-46F4-AEDA-B2CF50E981B0@umn.edu
In response to Re: partial time stamp query  (Kirk Wythers <kirk.wythers@gmail.com>)
Responses Re: partial time stamp query  (Misa Simic <misa.simic@gmail.com>)
List pgsql-general
Hi Brent,

Nice to hear from you. I hope your world is good.

On Feb 4, 2013, at 2:14 PM, Brent Wood <Brent.Wood@niwa.co.nz> wrote:

> Hi Kirk,
>
> We have a (near) real time data database for instrument observations from
> our research vessels. All observations (summarised to one minute
> intervals - the actual raw data is in netCDF; this database makes for
> easier access & meets most users' needs) go into a single table, with
> other tables for metadata about platforms, instruments, etc. Now
> approaching 350m records, so reasonably substantial.
>
> Underlying optimisations include:
>
> - a partitioned readings table, with a separate partition for each year
>   (now 23 years)
> - a clustered index on timestamp for the previous years' partitions
> - a largeish filesystem block size, tested to work well with the
>   clustered index & the small record size
>
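A minimal sketch of the partitioning and clustering ideas above, using PostgreSQL's modern declarative-partitioning syntax (table and column names are invented for illustration; in the PostgreSQL versions of this era the same effect was achieved with inheritance plus CHECK constraints):

```sql
-- Hypothetical readings table partitioned by year (names are illustrative).
CREATE TABLE readings (
    time2    timestamp NOT NULL,
    variable text,
    value    real
) PARTITION BY RANGE (time2);

-- One partition per year, e.g. 2010:
CREATE TABLE readings_2010 PARTITION OF readings
    FOR VALUES FROM ('2010-01-01') TO ('2011-01-01');

-- Index a closed-out year's partition on its timestamp and physically
-- reorder it to match (the "clustered index" optimisation above):
CREATE INDEX readings_2010_time2_idx ON readings_2010 (time2);
CLUSTER readings_2010 USING readings_2010_time2_idx;
```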
> These generally make a big difference to performance. To address one
> issue, much like yours, where some users want hourly data for a year,
> some want daily data for 10 years, & some want 1 minute data for the last
> month (& some, no doubt, want one minute data for 20+ years!), I
> introduced an integer column called timer. This value is set according to
> the time (not date) of each record.

Very similar to what I need to do. Our main table consists of records that
have been standardized to 15 minute timestamps. Here is a simplified
example:

record    timestamp        variable        value
1        12:00:00        temp        12.6
2        12:15:00        temp        12.3
3        12:30:00        temp        11.7
4        12:45:00        temp        12.3
5        13:00:00        temp        13.9
6        13:15:00        temp        12.5
7        13:30:00        temp        13.7
8        13:45:00        temp        12.0

You are exactly right: some people will want the original 15 minute
version, some people will want these summarized to hourly data, and others
will want these summarized to daily data. Still others may be satisfied
with monthly summaries.

>
> Along the lines of (from memory): an even number of minutes after the
> hour is 2, 5 minutes is 4, 10 minutes is 8, 15 minutes is 16, 30 minutes
> is 32, 60 minutes is 64, 6 hourly is 128, 12:00 AM is 256 & 12:00 PM is
> 512. When any timestamp is in more than one category (eg: 12:00 is all of
> even, 5, 15, 30 & 60 minutes), the timer value is set to the largest
> appropriate one.
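Brent's from-memory scheme could be populated with a single CASE expression tested coarsest-first, so each row gets the largest value that applies. This is only a reconstruction of the description above; the table and column names are hypothetical:

```sql
-- Hypothetical reconstruction of the timer scheme described above.
-- Categories are tested coarsest-first so each row gets the largest
-- value that applies.
UPDATE readings SET timer =
  CASE
    WHEN extract(hour from time2)::int = 12
         AND extract(minute from time2)::int = 0 THEN 512  -- 12:00 PM
    WHEN extract(hour from time2)::int = 0
         AND extract(minute from time2)::int = 0 THEN 256  -- 12:00 AM
    WHEN extract(hour from time2)::int % 6 = 0
         AND extract(minute from time2)::int = 0 THEN 128  -- 6 hourly
    WHEN extract(minute from time2)::int = 0      THEN 64  -- on the hour
    WHEN extract(minute from time2)::int % 30 = 0 THEN 32
    WHEN extract(minute from time2)::int % 15 = 0 THEN 16
    WHEN extract(minute from time2)::int % 10 = 0 THEN 8
    WHEN extract(minute from time2)::int % 5  = 0 THEN 4
    WHEN extract(minute from time2)::int % 2  = 0 THEN 2
    ELSE 0
  END;
```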

I'm not quite following. In my case, if I want hourly data, I'd be looking
for…

record    timestamp        variable        value
1        12:00:00         temp        12.225
2        13:00:00        temp        13.025

Are you saying that I could use an approach with a WHILE statement?
Something like:

WHILE date_trunc('hour', timestamp) = '12:00:00', then calculate
AVG(value)?
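For what it's worth, no WHILE should be needed: date_trunc in the SELECT list combined with GROUP BY produces one averaged row per hour. A self-contained sketch reproducing the example values above (the table name, column names, and date are invented for illustration):

```sql
-- Hourly averages of 15 minute readings via date_trunc + GROUP BY.
-- The VALUES list reproduces the simplified example table above.
WITH readings(time2, variable, value) AS (VALUES
  (timestamp '2010-01-01 12:00:00', 'temp', 12.6),
  (timestamp '2010-01-01 12:15:00', 'temp', 12.3),
  (timestamp '2010-01-01 12:30:00', 'temp', 11.7),
  (timestamp '2010-01-01 12:45:00', 'temp', 12.3),
  (timestamp '2010-01-01 13:00:00', 'temp', 13.9),
  (timestamp '2010-01-01 13:15:00', 'temp', 12.5),
  (timestamp '2010-01-01 13:30:00', 'temp', 13.7),
  (timestamp '2010-01-01 13:45:00', 'temp', 12.0))
SELECT date_trunc('hour', time2) AS hour, variable, avg(value)
FROM readings
GROUP BY 1, 2
ORDER BY 1;
-- The 12:00 group averages to 12.225 and the 13:00 group to 13.025,
-- matching the two-record result described above.
```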

>
> So a request for:
> 1 minute data is select from table;
> 2 minute data is select from table where timer >= 2 and timer != 15 and
> timer != 4;
> hourly data is select from table where timer >= 64 and timer != 15 and
> timer != 4;
> etc
>
> 5 & 15 minute add a bit of complexity, but we gave the users what they
> wanted. This has worked well for years now, & we have an internal web
> (mapserver/openlayers based) application allowing users to visualise &
> download their selected data - they choose from an interval pick list &
> the SQL is hidden. Some extra enhancements are the automatic collation of
> lat & lon gps readings into a PostGIS point for each reading record, &
> the automatic aggregation of daily points into daily track lines, so the
> track for any selected set of dates can easily be displayed on a map (the
> platforms are mobile vessels - not fixed sites)
>
> You might adapt some of these ideas for your use case?
>
> Cheers
>
> Brent Wood
>
> Programme leader: Environmental Information Delivery
> NIWA
> DDI:  +64 (4) 3860529
> ________________________________________
> From: pgsql-general-owner@postgresql.org
> [pgsql-general-owner@postgresql.org] on behalf of Kirk Wythers
> [wythe001@umn.edu]
> Sent: Tuesday, February 05, 2013 5:58 AM
> To: pgsql-general@postgresql.org
> Subject: Fwd: [GENERAL] partial time stamp query
>
> Thanks. That worked great! Now I am trying to aggregate these same
> fifteen minute records to hourly. I have tried using date_trunc:
>
> date_trunc('hour', derived_tsoil_fifteen_min_stacked.time2),
>
> but date_trunc only seems to truncate the timestamp. I thought I could
> use
>
> AVG(derived_tsoil_fifteen_min_stacked.value)
>
> in combination with date_trunc, but I still get 15 minute values, not the
> hourly average from the four 15 minute records.
>
> rowid date_trunc time2 site canopy plot variable name value avg
> 2010-07-07_00:00:00_b4warm_a 2010-07-07 00:00:00 1 2010-07-07 00:00:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 21.06 21.0599994659424
> 2010-07-07_00:15:00_b4warm_a 2010-07-07 00:00:00 1 2010-07-07 00:15:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 20.96 20.9599990844727
> 2010-07-07_00:30:00_b4warm_a 2010-07-07 00:00:00 1 2010-07-07 00:30:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 20.88 20.8799991607666
> 2010-07-07_00:45:00_b4warm_a 2010-07-07 00:00:00 1 2010-07-07 00:45:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 20.8 20.7999992370605
> 2010-07-07_01:00:00_b4warm_a 2010-07-07 01:00:00 1 2010-07-07 01:00:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 20.72 20.7199993133545
> 2010-07-07_01:15:00_b4warm_a 2010-07-07 01:00:00 1 2010-07-07 01:15:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 20.64 20.6399993896484
> 2010-07-07_01:30:00_b4warm_a 2010-07-07 01:00:00 1 2010-07-07 01:30:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 20.55 20.5499992370605
> 2010-07-07_01:45:00_b4warm_a 2010-07-07 01:00:00 1 2010-07-07 01:45:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 20.47 20.4699993133545
>
> I was trying to get two records out of this set, with the "avg" column
> representing the mean of the first four and the last four of the 15
> minute records.
>
> Perhaps date_trunc only works on the timestamp?
>
>
>
> On Feb 4, 2013, at 8:50 AM, Misa Simic <misa.simic@gmail.com> wrote:
>
> WHERE derived_tsoil_fifteen_min_stacked.time2::date = '2010-07-07'::date
>
> On Monday, February 4, 2013, Kirk Wythers wrote:
> I am trying to write a query that grabs one particular day from a
> timestamp column. The data are ordered in 15 minute chunks like this:
>
> 2010-07-07 12:45:00
> 2010-07-07 13:00:00
> 2010-07-07 13:15:00
> 2010-07-07 13:30:00
> etc…
>
> If I wanted all records from July 7th 2010, I would expect 4 x 24 = 96
> records per day.
>
> I have tried the '=' operator, like this
>
> WHERE derived_tsoil_fifteen_min_stacked.time2 = '2010-07-07*'
>
> but that grabs nothing, and using the '~' operator grabs everything with
> a 2010 or 07 in it… in other words all days from July of 2010.
>
> Any suggestions would be much appreciated.
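Two standard ways to grab one day's rows from a timestamp column, shown as sketches against the column named in the query above (a string equality with a wildcard can't match a timestamp, which is why the '=' attempt returned nothing):

```sql
-- 1. Cast to date, as Misa suggests (simple, but the cast can prevent a
--    plain index on time2 from being used):
SELECT * FROM derived_tsoil_fifteen_min_stacked
WHERE time2::date = date '2010-07-07';

-- 2. Half-open range (sargable - lets PostgreSQL use an index on time2):
SELECT * FROM derived_tsoil_fifteen_min_stacked
WHERE time2 >= timestamp '2010-07-07'
  AND time2 <  timestamp '2010-07-08';
```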
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
>
> --
> Please consider the environment before printing this email.
> NIWA is the trading name of the National Institute of Water &
> Atmospheric Research Ltd.
