Re: Using Postgres to store high volume streams of sensor readings - Mailing list pgsql-general

From Greg Smith
Subject Re: Using Postgres to store high volume streams of sensor readings
Date
Msg-id Pine.GSO.4.64.0811211246270.23150@westnet.com
Whole thread Raw
In response to Re: Using Postgres to store high volume streams of sensor readings  (Sam Mason <sam@samason.me.uk>)
Responses Re: Using Postgres to store high volume streams of sensor readings  ("Ciprian Dorin Craciun" <ciprian.craciun@gmail.com>)
List pgsql-general
On Fri, 21 Nov 2008, Sam Mason wrote:

> It's not quite what you're asking for; but have you checked out any
> of the databases that have resulted from the StreamSQL research?

A streaming database approach is in fact ideally suited to handling this
particular problem.  Looking at the original request here:

> * for a given client (and sensor), and time interval, I need the min,
> max, and avg of the values;
> * for a given time interval (and sensor), I need min, max, and avg of
> the values;

The most efficient way possible to compute these queries is to buffer the
full interval worth of data in memory as the values are being inserted,
compute these aggregates once the time window for the interval has ended,
then write a summarized version of the data.  Doing that sort of thing and
then operating on the aggregated data, rather than maintaining a bulky
index covering every single row, is exactly the sort of thing a good
streaming database would handle for you.

I can't comment on the current state of Borealis.  But as the original
focus of the streaming database research that spawned Truviso where I work
was accelerating data capture from sensor networks, I know this general
style of approach is quite beneficial here.  For example,
http://telegraph.cs.berkeley.edu/telegraphcq/v2.1/ is an open-source
implementation from that research that's integrated with an older version
of PostgreSQL.  If you look at the "windowed aggregates" example there, it
shows what a streaming query similar to the requirements here would look
like:  an average and other statistics produced on a per-interval basis.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

pgsql-general by date:

Previous
From: paulo matadr
Date:
Subject: Res: converter pgplsql funcion
Next
From: Alvaro Herrera
Date:
Subject: Re: Postgres mail list traffic over time