Re: Storing sensor data - Mailing list pgsql-performance

From Greg Jaman
Subject Re: Storing sensor data
Date
Msg-id b72893ad0905281156k4f4a9d07v398cb46938bf9cf1@mail.gmail.com
Whole thread Raw
In response to Re: Storing sensor data  (Ivan Voras <ivoras@freebsd.org>)
Responses Re: Storing sensor data  (Greg Jaman <gjaman@gmail.com>)
List pgsql-performance
I currently have a database doing something very similar.  I setup partition tables with predictable names based on the the data's timestamp week number eg:  (Data_YYYY_WI). 

I have a tigger on the parent partition table to redirect data to the correct partition( tablename:='Data_' || to_char('$NEW(ts)'::timestamptz, 'IYYY_IW') ) .  then I use dynamic sql to do the insert.  I did some optimization by writting it in pl/TCL and using global variables to store prepared insert statements.

Most queries for me are based  on the date and we have decent performance with our current setup.  For last/current sensor data we just store the last dataID in the sensor record.  I haven't thought of a better way yet.  After batch inserts we caculate the last reading for each participating sensorID inserted.

With partition tables we struggled with the query to get the lastest data :  select * from "Data" where "sensorID"=x order by ts limit 1  -- for parition tables.   See (http://archives.postgresql.org/pgsql-performance/2008-11/msg00284.php)  





On Thu, May 28, 2009 at 7:55 AM, Ivan Voras <ivoras@freebsd.org> wrote:
2009/5/28 Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>:
> Ivan Voras wrote:
>>
>> I need to store data about sensor readings. There is a known (but
>> configurable) number of sensors which can send update data at any time.
>> The "current" state needs to be kept but also all historical records.
>> I'm trying to decide between these two designs:
>>
>> 1) create a table for "current" data, one record for each sensor, update
>> this table when a sensor reading arrives, create a trigger that would
>> transfer old record data to a history table (of basically the same
>> structure)
>> 2) write only to the history table, use relatively complex queries or
>> outside-the-database magic to determine what the "current" values of the
>> sensors are.
>
> 3) write only to the history table, but have an INSERT trigger to update the
> table with "current" data. This has the same performance characteristics as
> 1, but let's you design your application like 2.

Excellent idea!

> I think I'd choose this approach (or 2), since it can handle out-of-order or
> delayed arrival of sensor readings gracefully (assuming they are timestamped
> at source).

It seems like your approach is currently the winner.

> If you go with 2, I'd recommend to still create a view to encapsulate the
> complex query for the current values, to make the application development
> simpler. And if it gets slow, you can easily swap the view with a table,
> updated with triggers or periodically, without changing the application.
>
>> The volume of sensor data is potentially huge, on the order of 500,000
>> updates per hour. Sensor data is few numeric(15,5) numbers.
>
> Whichever design you choose, you should also consider partitioning the data.

I'll look into it, but we'll first see if we can get away with
limiting the time the data needs to be available.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

pgsql-performance by date:

Previous
From: Fabrix
Date:
Subject: Scalability in postgres
Next
From: Greg Jaman
Date:
Subject: Re: Storing sensor data