I'm working on the design of a database for time series data collected by a variety of meteorological sensors. Many sensors share the same sampling scheme, but not all. I initially thought it would be a good idea to have a table identifying each parameter (variable) that the sensors report on:
CREATE TABLE parameters (
    parameter_id serial PRIMARY KEY,
    parameter_name character varying(200) NOT NULL,
    ...
)
and then store the data in a table referencing it:
CREATE TABLE series (
    record_id serial PRIMARY KEY,
    parameter_id integer REFERENCES parameters,
    reading ????
    ...
)
but of course, the data type varies from parameter to parameter, so it's impossible to assign a single data type to the "reading" column. The number of variables measured by the sensors is quite large and may grow or shrink over time, and there is no obvious way to group them into subjects (tables), so simply giving each variable its own column is not practical either.
I've been trying to search for solutions in various sources, but am having trouble finding relevant material. I'd appreciate any advice.
If you are not set on using PostgreSQL, you could have a look at OpenTSDB: http://opentsdb.net/
That was one project we found interesting when we faced a similar problem a couple of years ago. In the end, many other factors made us opt for Cassandra. We started with PostgreSQL, but our requirements included, among others, the ability to add new devices and parameters quickly. So the persistence layer was mostly a data sink, and we planned to move cleansed/aggregated data to PostgreSQL for analysis. Most of the master data was also in PostgreSQL: devices, parameters, units.
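If you do stay with PostgreSQL, one common way around the "reading ????" problem is to keep one nullable column per value type and let the parameter row say which one applies. The following is only a rough sketch under my own assumptions, not what we ran in production: the units table, the value_type, observed_at and reading_* columns, and the sample values are all hypothetical names made up for illustration. It also assumes PostgreSQL 9.6 or later for num_nonnulls().

```sql
-- Hypothetical master data: units and parameters (names are assumptions).
CREATE TABLE units (
    unit_id   serial PRIMARY KEY,
    unit_name varchar(50) NOT NULL UNIQUE
);

CREATE TABLE parameters (
    parameter_id   serial PRIMARY KEY,
    parameter_name varchar(200) NOT NULL UNIQUE,
    unit_id        integer REFERENCES units,
    -- Declares which reading_* column the parameter is expected to use.
    value_type     text NOT NULL
        CHECK (value_type IN ('numeric', 'text', 'boolean'))
);

-- One row per observation; exactly one typed reading column is filled.
CREATE TABLE series (
    record_id    bigserial PRIMARY KEY,
    parameter_id integer NOT NULL REFERENCES parameters,
    observed_at  timestamptz NOT NULL,
    reading_num  double precision,
    reading_text text,
    reading_bool boolean,
    CHECK (num_nonnulls(reading_num, reading_text, reading_bool) = 1)
);

-- Adding a new parameter is an INSERT, not an ALTER TABLE.
INSERT INTO units (unit_name) VALUES ('degC');

INSERT INTO parameters (parameter_name, unit_id, value_type)
SELECT 'air_temperature', unit_id, 'numeric'
FROM units
WHERE unit_name = 'degC';

INSERT INTO series (parameter_id, observed_at, reading_num)
SELECT parameter_id, TIMESTAMPTZ '2024-01-01 00:00:00+00', 12.5
FROM parameters
WHERE parameter_name = 'air_temperature';
```

The CHECK constraint only guarantees that exactly one reading column is set; it does not verify that the filled column matches the parameter's value_type, so that would still need a trigger or application-side validation. A single jsonb reading column is another option if the types are too varied for a fixed set of columns.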