Thread: Timestamp/hstore query?

Timestamp/hstore query?

From
Brent Wood
Date:
Hi,


I have a database with instrument readings stored in hstore key-value pairs, so a record has a timestamp attribute and an hstore attribute with all the sensor readings for that time. The key identifies the sensor, the value is the reading.

Not all sensors have a reading at every timestamp.
Timestamps are 1 sec intervals.
The db uses Timescaledb hypertables to partition the data

How can I select the LAST (hstore) value for each instrument (hstore key) at one minute intervals. So I get an output of 1 minute timestamps & an hstore containing the last key-value pairs for each key in present in that 1 minute interval.

Essentially this provides a snapshot saying "at this time (1 min interval) the latest (current) reading for each sensor. was this value"

I'm OK with a user defined pl/pgsql function if this is the best approach.


Thanks

Brent Wood

Principal Technician, Fisheries
NIWA
DDI:  +64 (4) 3860529
Brent Wood
Principal Technician - GIS and Spatial Data Management
Programme Leader - Environmental Information Delivery
+64-4-386-0529
National Institute of Water & Atmospheric Research Ltd (NIWA)
301 Evans Bay Parade Hataitai Wellington New Zealand
Connect with NIWA: niwa.co.nz Facebook LinkedIn Twitter Instagram
To ensure compliance with legal requirements and to maintain cyber security standards, NIWA's IT systems are subject to ongoing monitoring, activity logging and auditing. This monitoring and auditing service may be provided by third parties. Such third parties can access information transmitted to, processed by and stored on NIWA's IT systems

Re: Timestamp/hstore query?

From
Michael Lewis
Date:
If you share example schema and desired output (like a dummy table or even pseudo code SQL), then I'm sure many people could help you. Right now, the description of your desired result seems a bit unclear, at least to me.


If you wanted to run this hourly for the last 1 hour, it sounds a bit like want this-

select sensor_id, date_trunc( 'minute', timestamptz_field_name_here ), last( value_from_hstore ) over ( partition by sensor_id, date_trunc( 'minute', timestamptz_field_name_here ) ) as last_value_recorded
from data_table_here
where timestamptz_field_name_here BETWEEN NOW() - interval '1 hour' and NOW()
group by sensor_id, date_trunc( 'minute', timestamptz_field_name_here )


You could also use the left join "where is null" pattern to check that a prior record in the minute period of time does not exist for that same key. Something like this-

select d1.sensor_id, date_trunc( 'minute', d1.timestamptz_field_name_here ), d1.timestamptz_field_name_here as last_value_recorded
from data_table_here as d1
left join data_table_here as prior_d1 ON prior_d1.sensor_id = d1.sensor_id AND prior_d1.timestamptz_field_name_here < d1.timestamptz_field_name_here and prior_d1.timestamptz_field_name_here >= date_trunc( 'minute', d1.timestamptz_field_name_here )
where d1.timestamptz_field_name_here BETWEEN NOW() - interval '1 hour' and NOW()

Re: Timestamp/hstore query?

From
Brent Wood
Date:
Thanks for your reply,

The table is essentially:
create table readings (timer    timestamp primary key,
                       readings  hstore);

the hstore comprises (<sensor_id> <reading>) key/value pairs for readings taken at the time specified in the timestamp.

eg:  "67" "-45.67436", "68" "176.5424" could be key/value pairs representing latitude & longitude, with a timestamp in the timer column.

There would be several lat/lon hstore pairs in a given minute, the query I want would return the last one in the timeseries for that minute (for each key).

I don't think your examples will give me the separate hstore key-value pairs, extracted as the last in the interval for each key & reassembled as an hstore list in the result. The sensor id is the hstore key, as described above, not a separate attribute. That said, the keys can be extracted from the hstore much like a column, but I'm not sure that is the best approach.

Treating each set of hstores in an interval as an array & extracting the last elements may be viable. But I['m not sure how...



On Thursday, April 15, 2021, 6:33:08 AM GMT+12, Michael Lewis <mlewis@entrata.com> wrote:





If you share example schema and desired output (like a dummy table or even pseudo code SQL), then I'm sure many people could help you. Right now, the description of your desired result seems a bit unclear, at least to me.


If you wanted to run this hourly for the last 1 hour, it sounds a bit like want this-

select sensor_id, date_trunc( 'minute', timestamptz_field_name_here ), last( value_from_hstore ) over ( partition by sensor_id, date_trunc( 'minute', timestamptz_field_name_here ) ) as last_value_recorded
from data_table_here
where timestamptz_field_name_here BETWEEN NOW() - interval '1 hour' and NOW()
group by sensor_id, date_trunc( 'minute', timestamptz_field_name_here )


You could also use the left join "where is null" pattern to check that a prior record in the minute period of time does not exist for that same key. Something like this-

select d1.sensor_id, date_trunc( 'minute', d1.timestamptz_field_name_here ), d1.timestamptz_field_name_here as last_value_recorded
from data_table_here as d1
left join data_table_here as prior_d1 ON prior_d1.sensor_id = d1.sensor_id AND prior_d1.timestamptz_field_name_here < d1.timestamptz_field_name_here and prior_d1.timestamptz_field_name_here >= date_trunc( 'minute', d1.timestamptz_field_name_here )
where d1.timestamptz_field_name_here BETWEEN NOW() - interval '1 hour' and NOW()


>  


Re: Timestamp/hstore query?

From
Mike Rylander
Date:
On Wed, Apr 14, 2021 at 5:33 PM Brent Wood <pcreso@yahoo.com> wrote:
>
> Thanks for your reply,
>
> The table is essentially:
> create table readings (timer    timestamp primary key,
>                        readings  hstore);
>
> the hstore comprises (<sensor_id> <reading>) key/value pairs for readings taken at the time specified in the
timestamp.
>
> eg:  "67" "-45.67436", "68" "176.5424" could be key/value pairs representing latitude & longitude, with a timestamp
inthe timer column. 
>
> There would be several lat/lon hstore pairs in a given minute, the query I want would return the last one in the
timeseriesfor that minute (for each key). 
>
> I don't think your examples will give me the separate hstore key-value pairs, extracted as the last in the interval
foreach key & reassembled as an hstore list in the result. The sensor id is the hstore key, as described above, not a
separateattribute. That said, the keys can be extracted from the hstore much like a column, but I'm not sure that is
thebest approach. 
>
> Treating each set of hstores in an interval as an array & extracting the last elements may be viable. But I['m not
surehow... 
>

Hi Brent,

With the table structure like that, you'll need to project all the
sensor data onto each timestamp and then collapse it.  Something like
this:

  SELECT date_trunc('minute',timer), key, first(value) over (partition
by key order by timer desc) FROM (SELECT timer, (each(readings)).*
FROM table WHERE date_trunc('minute',timer) = '2021-04-15
08:10:00-04')x;

If your queries will always target a specific timestamp truncated to
the minute, you'll find an index on date_trunc('minute',timer) will be
useful, I think.  If your intervals are more complex than date_trunc()
can handle then you'll have to get more creative, but that's the basic
structure -- normalize the primary key, project the sensor data to
make it table-ish, then use regular SQL to pull out what you want.

--
Mike Rylander
 | Research and Development Manager
 | Equinox Open Library Initiative
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  miker@equinoxOLI.org
 | web:  https://equinoxOLI.org

>
>
> On Thursday, April 15, 2021, 6:33:08 AM GMT+12, Michael Lewis <mlewis@entrata.com> wrote:
>
>
>
>
>
> If you share example schema and desired output (like a dummy table or even pseudo code SQL), then I'm sure many
peoplecould help you. Right now, the description of your desired result seems a bit unclear, at least to me. 
>
>
> If you wanted to run this hourly for the last 1 hour, it sounds a bit like want this-
>
> select sensor_id, date_trunc( 'minute', timestamptz_field_name_here ), last( value_from_hstore ) over ( partition by
sensor_id,date_trunc( 'minute', timestamptz_field_name_here ) ) as last_value_recorded 
> from data_table_here
> where timestamptz_field_name_here BETWEEN NOW() - interval '1 hour' and NOW()
> group by sensor_id, date_trunc( 'minute', timestamptz_field_name_here )
>
>
> You could also use the left join "where is null" pattern to check that a prior record in the minute period of time
doesnot exist for that same key. Something like this- 
>
> select d1.sensor_id, date_trunc( 'minute', d1.timestamptz_field_name_here ), d1.timestamptz_field_name_here as
last_value_recorded
> from data_table_here as d1
> left join data_table_here as prior_d1 ON prior_d1.sensor_id = d1.sensor_id AND prior_d1.timestamptz_field_name_here <
d1.timestamptz_field_name_hereand prior_d1.timestamptz_field_name_here >= date_trunc( 'minute',
d1.timestamptz_field_name_here) 
> where d1.timestamptz_field_name_here BETWEEN NOW() - interval '1 hour' and NOW()
>
>
> >
>
>



Re: Timestamp/hstore query?

From
Brent Wood
Date:
Thanks Mike,

That approach seems to work very well!!!

There are something on the order of 40 billion readings in the table. This returns an arbitrary one minutes data, as required, in under 30ms:

SELECT distinct date_trunc('minute',_timer) as timer,
                                key::int,
                                first_value(value)
                         OVER (partition by key order by _timer desc)
                         FROM (SELECT timer as _timer,
                                      (each(v.values)).*
                               FROM v_sec v
                               WHERE timer between '2020-04-15 08:10:00'
                                               and '2020-04-15 08:10:59' ) as x
                         order by key::int;


v_sec is a view on a table with 30+ years of 1 second readings from dozens of sensors, columns are timer (timestamp) & values (hstore).

I used "between timestamps" rather than "= date_trunc()" in the where clause as this uses the index on timestamps, so is MUCH faster.

FYI, we are using Timescaledb to manage the underlying data and data structures.



Appreciated!!!

Brent Wood


On Friday, April 16, 2021, 12:28:41 AM GMT+12, Mike Rylander <mrylander@gmail.com> wrote:


On Wed, Apr 14, 2021 at 5:33 PM Brent Wood <pcreso@yahoo.com> wrote:
>
> Thanks for your reply,
>
> The table is essentially:
> create table readings (timer    timestamp primary key,
>                        readings  hstore);
>
> the hstore comprises (<sensor_id> <reading>) key/value pairs for readings taken at the time specified in the timestamp.
>
> eg:  "67" "-45.67436", "68" "176.5424" could be key/value pairs representing latitude & longitude, with a timestamp in the timer column.
>
> There would be several lat/lon hstore pairs in a given minute, the query I want would return the last one in the timeseries for that minute (for each key).
>
> I don't think your examples will give me the separate hstore key-value pairs, extracted as the last in the interval for each key & reassembled as an hstore list in the result. The sensor id is the hstore key, as described above, not a separate attribute. That said, the keys can be extracted from the hstore much like a column, but I'm not sure that is the best approach.
>
> Treating each set of hstores in an interval as an array & extracting the last elements may be viable. But I['m not sure how...
>

Hi Brent,

With the table structure like that, you'll need to project all the
sensor data onto each timestamp and then collapse it.  Something like
this:

  SELECT date_trunc('minute',timer), key, first(value) over (partition
by key order by timer desc) FROM (SELECT timer, (each(readings)).*
FROM table WHERE date_trunc('minute',timer) = '2021-04-15
08:10:00-04')x;

If your queries will always target a specific timestamp truncated to
the minute, you'll find an index on date_trunc('minute',timer) will be
useful, I think.  If your intervals are more complex than date_trunc()
can handle then you'll have to get more creative, but that's the basic
structure -- normalize the primary key, project the sensor data to
make it table-ish, then use regular SQL to pull out what you want.

--
Mike Rylander
| Research and Development Manager
| Equinox Open Library Initiative
| phone:  1-877-OPEN-ILS (673-6457)
| email:  miker@equinoxOLI.org
| web:  https://equinoxOLI.org


>
>
> On Thursday, April 15, 2021, 6:33:08 AM GMT+12, Michael Lewis <mlewis@entrata.com> wrote:
>
>
>
>
>
> If you share example schema and desired output (like a dummy table or even pseudo code SQL), then I'm sure many people could help you. Right now, the description of your desired result seems a bit unclear, at least to me.
>
>
> If you wanted to run this hourly for the last 1 hour, it sounds a bit like want this-
>
> select sensor_id, date_trunc( 'minute', timestamptz_field_name_here ), last( value_from_hstore ) over ( partition by sensor_id, date_trunc( 'minute', timestamptz_field_name_here ) ) as last_value_recorded
> from data_table_here
> where timestamptz_field_name_here BETWEEN NOW() - interval '1 hour' and NOW()
> group by sensor_id, date_trunc( 'minute', timestamptz_field_name_here )
>
>
> You could also use the left join "where is null" pattern to check that a prior record in the minute period of time does not exist for that same key. Something like this-
>
> select d1.sensor_id, date_trunc( 'minute', d1.timestamptz_field_name_here ), d1.timestamptz_field_name_here as last_value_recorded
> from data_table_here as d1
> left join data_table_here as prior_d1 ON prior_d1.sensor_id = d1.sensor_id AND prior_d1.timestamptz_field_name_here < d1.timestamptz_field_name_here and prior_d1.timestamptz_field_name_here >= date_trunc( 'minute', d1.timestamptz_field_name_here )
> where d1.timestamptz_field_name_here BETWEEN NOW() - interval '1 hour' and NOW()
>
>
> >
>
>