Thread: Best database structure for timely ordered values

Best database structure for timely ordered values

From
Reiner Dassing
Date:
Hi!

I am thinking about the best structure of a database to describe timely ordered
values.
The timely ordered values are the results of different sensors measuring
environmental
parameters like temperatur, pressure, humidity.

The database has to define the kind of measurements, kind of sensors, place of
them,
ie., more or less static information.
The main part will be the measurement values ordered by the time or epoch due to
UTC.
The primary index must be the epoch.
As there will be no deletion from this data I fear - due to the internal
representation
of B-trees - the performance will degrade very soon. The expected number of
measurements
will easily reach some millions.

Any comments or ideas?
Reiner


Re: Best database structure for timely ordered values

From
Tom Lane
Date:
Reiner Dassing <dassing@wettzell.ifag.de> writes:
> The primary index must be the epoch.
> As there will be no deletion from this data I fear - due to the
> internal representation of B-trees - the performance will degrade very
> soon.

Nonsense.  btree should work just fine for that.  Use a timestamp
column for the primary key, and away you go.

(Actually, time alone doesn't seem like it'd be necessarily unique,
so maybe you don't want to call it a primary key.  But certainly
you can make a non-unique index on that column.)
        regards, tom lane


Re: Best database structure for timely ordered values

From
Bruce Momjian
Date:
> Reiner Dassing <dassing@wettzell.ifag.de> writes:
> > The primary index must be the epoch.
> > As there will be no deletion from this data I fear - due to the
> > internal representation of B-trees - the performance will degrade very
> > soon.
> 
> Nonsense.  btree should work just fine for that.  Use a timestamp
> column for the primary key, and away you go.
> 
> (Actually, time alone doesn't seem like it'd be necessarily unique,
> so maybe you don't want to call it a primary key.  But certainly
> you can make a non-unique index on that column.)

I assume the user is concerned about non-balanced btrees.  Ours are
auto-balancing.


--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Best database structure for timely ordered values

From
Reiner Dassing
Date:
Bruce Momjian wrote:
> 
> > Reiner Dassing <dassing@wettzell.ifag.de> writes:
> > > The primary index must be the epoch.
> > > As there will be no deletion from this data I fear - due to the
> > > internal representation of B-trees - the performance will degrade very
> > > soon.
> >
> > Nonsense.  btree should work just fine for that.  Use a timestamp
> > column for the primary key, and away you go.
> >
> > (Actually, time alone doesn't seem like it'd be necessarily unique,
> > so maybe you don't want to call it a primary key.  But certainly
> > you can make a non-unique index on that column.)
> 
> I assume the user is concerned about non-balanced btrees.  
That is correct! 
As I tested an example database with about 7 million entries on PostgreSQL V
6.5.2
and the result of
select count(*) from table;
tooks about 3 minutes I have this concern.
May be version 7.0.3 will give a better result.

> Ours are
> auto-balancing.
> 
> --
>   Bruce Momjian                        |  http://candle.pha.pa.us
>   pgman@candle.pha.pa.us               |  (610) 853-3000
>   +  If your life is a hard drive,     |  830 Blythe Avenue
>   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

--
  Reiner Dassing


Re: Best database structure for timely ordered values

From
Bruce Momjian
Date:
> > I assume the user is concerned about non-balanced btrees.  
> That is correct! 
> As I tested an example database with about 7 million entries on PostgreSQL V
> 6.5.2
> and the result of
> select count(*) from table;
> tooks about 3 minutes I have this concern.
> May be version 7.0.3 will give a better result.

The larger problem is that count(*) doesn't use the index at all.  It
just does a sequential scan of the heap table.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: Best database structure for timely ordered values

From
Reiner Dassing
Date:
Bruce Momjian wrote:
> 
> The larger problem is that count(*) doesn't use the index at all.  It
> just does a sequential scan of the heap table.

Here is another example, which is even worse:
select value from table where id=1 order by epoch desc limit 1;
to ask for the last stored value.

This request needs about 5 minutes to complete.

(But please consider that the postgres server has some other requests
to serve.
The machine running this server is an alpha server with Tru64)

The explain statements results in:

explain select * from table where id=1 order by epoche desc limit 1;
NOTICE:  QUERY PLAN:

Sort  (cost=12692.74 rows=202175 width=16) ->  Index Scan using wetter_pkey on table  (cost=12692.74 rows=202175
width=16)

EXPLAIN


--
  Reiner Dassing


Re: Best database structure for timely ordered values

From
Tom Lane
Date:
Reiner Dassing <dassing@wettzell.ifag.de> writes:
> Here is another example, which is even worse:
> select value from table where id=1 order by epoch desc limit 1;
> to ask for the last stored value.
> This request needs about 5 minutes to complete.

Hm.  That should produce a decent plan given the right indexes.
On 7.0.2 I see:

play=> create table foo (id int, epoch timestamp primary key);
NOTICE:  CREATE TABLE/PRIMARY KEY will create implicit index 'foo_pkey' for table 'foo'
CREATE
play=> explain select * from foo where id=1 order by epoch desc limit 1;
NOTICE:  QUERY PLAN:

Index Scan Backward using foo_pkey on foo  (cost=0.00..62.50 rows=10 width=12)

EXPLAIN

which ought to work pretty well unless id=1 is very rare.

> The explain statements results in:

> explain select * from table where id=1 order by epoche desc limit 1;
> NOTICE:  QUERY PLAN:

> Sort  (cost=12692.74 rows=202175 width=16)
>   -> Index Scan using wetter_pkey on table  (cost=12692.74 rows=202175 width=16)

That's not very informative, since you haven't told us what that index is...
        regards, tom lane