My initial inclination is always to build the simplest-to-understand system first. Space is cheap and pg is pretty efficient; engineering time is expensive, and debugging time doubly so, with a side of anxiety when production goes down. It also leaves you more flexibility later on, since it describes your current setup semantically rather than in a condensed form.
I would suggest building a simpler schema and benchmarking insert times and storage space.
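For example (just a sketch, with made-up column names), the "simple" layout would be roughly one column per sensor:

create table m1 (
    id      bigserial primary key,
    logged  timestamptz not null,
    s1      boolean,   -- one column per boolean sensor, s1..s20
    s21     float4     -- one column per float sensor, s21..s40
);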
If you go with your condensed form, I would suggest writing functions to extract any data whose retrieval is more complicated than a single expression on a single field.
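For instance (a sketch only, against the bit(20)/float8[] layout you show below; the function name and bit position are made up):

-- give a packed boolean sensor a meaningful name instead of "bit 3 of b20"
create function m1_door_open(b bit(20)) returns boolean as $$
    select substring(b from 3 for 1) = b'1';
$$ language sql immutable;

-- so a query can read semantically:
-- SELECT m1_door_open(b20) FROM m1 WHERE created BETWEEN '2017-11-01' AND '2017-11-30';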
Just my 2¢.
Best wishes,
Jim
On December 9, 2017 2:22:02 PM EST, Stefan Keller <sfkeller@gmail.com> wrote:
Hi,
Given this kind of sensors (Internet-of-Things) log and monitoring scenario:
* There are 3 production machines, monitored every few seconds for the
coming (~2) years.
* Machine m1 emits 20 boolean and 20 float4 values captured by sensors
(m1s1..m1s40).
* Machine m2 has the same attributes as m1 plus 10+10 more (m2s1..m2s20).
* Machine m3: like m2, but half of the attributes are different.
* Queries happen about once a day, like:
SELECT m1s1, m1s2 FROM m1 WHERE logged BETWEEN '2017-11-01' AND '2017-11-30';
So this is a kind of "Immutable DB" where there are
* a rather static schema, with sources that have overlapping attributes,
* heavy writes,
* periodic reads
Would you also model this schema like my proposition below, which saves
space but makes inserts/updates a little more complex due to the arrays?
create table m1 (
    id bigint,
    created timestamp,
    b20 bit(20) default b'00000000000000000000',
    farr20 float8[20]
);
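For illustration (values made up), an insert and a single-sensor update
against this condensed form would look roughly like:

insert into m1 (id, created, b20, farr20)
values (1, now(), b'10000000000000000001', array_fill(0.0::float8, array[20]));

update m1 set farr20[7] = 3.14 where id = 1;
update m1 set b20 = overlay(b20 placing b'1' from 7 for 1) where id = 1;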
:Stefan
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.