Thread: Best database structure for timely ordered values
Hi! I am thinking about the best structure of a database to describe timely ordered values. The timely ordered values are the results of different sensors measuring environmental parameters like temperatur, pressure, humidity. The database has to define the kind of measurements, kind of sensors, place of them, ie., more or less static information. The main part will be the measurement values ordered by the time or epoch due to UTC. The primary index must be the epoch. As there will be no deletion from this data I fear - due to the internal representation of B-trees - the performance will degrade very soon. The expected number of measurements will easily reach some millions. Any comments or ideas? Reiner
Reiner Dassing <dassing@wettzell.ifag.de> writes: > The primary index must be the epoch. > As there will be no deletion from this data I fear - due to the > internal representation of B-trees - the performance will degrade very > soon. Nonsense. btree should work just fine for that. Use a timestamp column for the primary key, and away you go. (Actually, time alone doesn't seem like it'd be necessarily unique, so maybe you don't want to call it a primary key. But certainly you can make a non-unique index on that column.) regards, tom lane
> Reiner Dassing <dassing@wettzell.ifag.de> writes: > > The primary index must be the epoch. > > As there will be no deletion from this data I fear - due to the > > internal representation of B-trees - the performance will degrade very > > soon. > > Nonsense. btree should work just fine for that. Use a timestamp > column for the primary key, and away you go. > > (Actually, time alone doesn't seem like it'd be necessarily unique, > so maybe you don't want to call it a primary key. But certainly > you can make a non-unique index on that column.) I assume the user is concerned about non-balanced btrees. Ours are auto-balancing. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian wrote: > > > Reiner Dassing <dassing@wettzell.ifag.de> writes: > > > The primary index must be the epoch. > > > As there will be no deletion from this data I fear - due to the > > > internal representation of B-trees - the performance will degrade very > > > soon. > > > > Nonsense. btree should work just fine for that. Use a timestamp > > column for the primary key, and away you go. > > > > (Actually, time alone doesn't seem like it'd be necessarily unique, > > so maybe you don't want to call it a primary key. But certainly > > you can make a non-unique index on that column.) > > I assume the user is concerned about non-balanced btrees. That is correct! As I tested an example database with about 7 million entries on PostgreSQL V 6.5.2 and the result of select count(*) from table; tooks about 3 minutes I have this concern. May be version 7.0.3 will give a better result. > Ours are > auto-balancing. > > -- > Bruce Momjian | http://candle.pha.pa.us > pgman@candle.pha.pa.us | (610) 853-3000 > + If your life is a hard drive, | 830 Blythe Avenue > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 -- Reiner Dassing
> > I assume the user is concerned about non-balanced btrees. > That is correct! > As I tested an example database with about 7 million entries on PostgreSQL V > 6.5.2 > and the result of > select count(*) from table; > tooks about 3 minutes I have this concern. > May be version 7.0.3 will give a better result. The larger problem is that count(*) doesn't use the index at all. It just does a sequential scan of the heap table. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian wrote: > > The larger problem is that count(*) doesn't use the index at all. It > just does a sequential scan of the heap table. Here is another example, which is even worse: select value from table where id=1 order by epoch desc limit 1; to ask for the last stored value. This request needs about 5 minutes to complete. (But please consider that the postgres server has some other requests to serve. The machine running this server is an alpha server with Tru64) The explain statements results in: explain select * from table where id=1 order by epoche desc limit 1; NOTICE: QUERY PLAN: Sort (cost=12692.74 rows=202175 width=16) -> Index Scan using wetter_pkey on table (cost=12692.74 rows=202175 width=16) EXPLAIN -- Reiner Dassing
Reiner Dassing <dassing@wettzell.ifag.de> writes: > Here is another example, which is even worse: > select value from table where id=1 order by epoch desc limit 1; > to ask for the last stored value. > This request needs about 5 minutes to complete. Hm. That should produce a decent plan given the right indexes. On 7.0.2 I see: play=> create table foo (id int, epoch timestamp primary key); NOTICE: CREATE TABLE/PRIMARY KEY will create implicit index 'foo_pkey' for table 'foo' CREATE play=> explain select * from foo where id=1 order by epoch desc limit 1; NOTICE: QUERY PLAN: Index Scan Backward using foo_pkey on foo (cost=0.00..62.50 rows=10 width=12) EXPLAIN which ought to work pretty well unless id=1 is very rare. > The explain statements results in: > explain select * from table where id=1 order by epoche desc limit 1; > NOTICE: QUERY PLAN: > Sort (cost=12692.74 rows=202175 width=16) > -> Index Scan using wetter_pkey on table (cost=12692.74 rows=202175 width=16) That's not very informative, since you haven't told us what that index is... regards, tom lane