Re: Storing sensor data - Mailing list pgsql-performance

From Ivan Voras
Subject Re: Storing sensor data
Msg-id 9bbcef730905280806s6605d3bejd8d579be45c6f017@mail.gmail.com
In response to Re: Storing sensor data  (Alexander Staubo <alex@bengler.no>)
Responses Re: Storing sensor data
List pgsql-performance
2009/5/28 Alexander Staubo <alex@bengler.no>:
> On Thu, May 28, 2009 at 2:54 PM, Ivan Voras <ivoras@freebsd.org> wrote:
>> The volume of sensor data is potentially huge, on the order of 500,000
>> updates per hour. Sensor data is a few numeric(15,5) numbers.
>
> The size of that dataset, combined with the apparent simplicity of
> your schema and the apparent requirement for mostly sequential access
> (I'm guessing about the latter two),

Your guesses are correct, except that every now and then a single value
needs to be retrieved at random, looked up by its timestamp.

> all lead me to suspect you would
> be happier with something other than a traditional relational
> database.
>
> I don't know how exact your historical data has to be. Could you get

No "lossy" compression is allowed. Exact data is needed for the whole dataset.

> If you require precise data with the ability to filter, aggregate and
> correlate over multiple dimensions, something like Hadoop -- or one of
> the Hadoop-based column database implementations, such as HBase or
> Hypertable -- might be a better option, combined with MapReduce/Pig to
> execute analysis jobs

This looks like an interesting idea to investigate. Do you have more
experience with such databases? How do they fare with the following
requirements:

* Storing large datasets (do they pack data well in the database? No
wasted space like in e.g. hash tables?)
* Retrieving specific random records based on a timestamp or record ID?
* Storing "infinite" datasets (i.e. whose size is not known in
advance - cf. e.g. hash tables)
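
To make the three requirements above concrete, here is a minimal sketch of
the kind of storage they describe: an append-only file of fixed-width
records, sorted by timestamp, searched with a binary search. Everything in
it is an assumption for illustration, not a description of how HBase,
Hypertable or PostgreSQL store data: the record layout, the choice of three
values per reading, the integer timestamp, and the names `RECORD`, `encode`,
`decode` and `find` are all hypothetical. Each numeric(15,5) value is kept
exact by storing it as an integer scaled by 10^5, which fits in 64 bits.

```python
import struct
from decimal import Decimal

SCALE = 10 ** 5                  # numeric(15,5) has 5 fractional digits
RECORD = struct.Struct("<q3q")   # assumed layout: int64 timestamp + 3 scaled int64 values


def to_scaled(value):
    """Convert a decimal string to an exact scaled integer, refusing to round."""
    d = Decimal(value) * SCALE
    if d != d.to_integral_value():
        raise ValueError("more than 5 fractional digits: %r" % (value,))
    return int(d)


def encode(ts, values):
    """Pack one reading into a fixed-width record with no padding or free space."""
    return RECORD.pack(ts, *(to_scaled(v) for v in values))


def decode(rec):
    """Turn an unpacked record back into (timestamp, [Decimal, ...])."""
    return rec[0], [Decimal(n).scaleb(-5) for n in rec[1:]]


def find(f, ts):
    """Binary-search a seekable file of records sorted by timestamp."""
    f.seek(0, 2)                         # seek to the end to learn the record count
    n = f.tell() // RECORD.size
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi) // 2
        f.seek(mid * RECORD.size)
        if RECORD.unpack(f.read(RECORD.size))[0] < ts:
            lo = mid + 1
        else:
            hi = mid
    if lo < n:
        f.seek(lo * RECORD.size)
        rec = RECORD.unpack(f.read(RECORD.size))
        if rec[0] == ts:
            return decode(rec)
    return None                          # no record with exactly this timestamp
```

Appending never requires knowing the final size in advance, every record
takes exactly `RECORD.size` bytes, and a lookup costs O(log n) seeks. The
sketch only works if readings arrive in timestamp order and timestamps are
unique, which is a further assumption.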

On the other hand, we could periodically transfer data from PostgreSQL
into a simpler database (e.g. BDB) for archival purposes (at the
expense of more code). Would such a database be better suited?
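
A minimal sketch of what the archival side of such a periodic transfer
might look like. It uses Python's standard `dbm` module purely as a stand-in
for a simple key-value store like BDB; it is not the Berkeley DB API. The
`rows` argument is assumed to be whatever the PostgreSQL side has already
fetched, as (timestamp, payload bytes) pairs, and the names `archive` and
`fetch` are hypothetical.

```python
import dbm
import struct

KEY = struct.Struct(">q")   # assumed key format: 64-bit integer timestamp


def archive(rows, path):
    """Write (timestamp, payload_bytes) pairs into a key-value archive.

    `rows` stands in for rows already selected from PostgreSQL; no database
    connection is made here. Returns the number of rows written.
    """
    count = 0
    with dbm.open(path, "c") as db:      # "c": create the archive if missing
        for ts, payload in rows:
            db[KEY.pack(ts)] = payload
            count += 1
    return count


def fetch(path, ts):
    """Look up one archived payload by timestamp, or None if absent."""
    with dbm.open(path, "r") as db:
        return db.get(KEY.pack(ts))
```

The extra code mentioned above would mostly be around this: selecting the
rows that are old enough, deleting them from PostgreSQL only after the
archive write has succeeded, and deciding how to split archives by period.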
