Events tables, model discussion in regards to the performances - Mailing list pgsql-general

From Marc Mamin
Subject Events tables, model discussion in regards to the performances
Date
Msg-id CA896D7906BF224F8A6D74A1B7E54AB3198757@JENMAIL01.ad.intershop.net
Whole thread Raw
List pgsql-general

Hallo,

I have a large amount of time based events to aggregate, with a finite list of possible events.
the events are logged one at the time:

timestamp_1 : event_1
timestamp_2 : event_2
timestamp_3 : event_1
...

My idea is to prepare the data before to import them in order to get one separate column per event
along with the desired time aggregation

timestamp : houroffset : is_event_1 :is_event_2 : is_event_3

t1                 132         True        null         null
t2                 132         null        True         null
t3                 133         True        null         null
t4                 134         null        null         True

....

With this model,  I can easily count my events per time periode:

select
  count(is_event_1) as C1,
  count(is_event_2) as C2,
  count(is_event_3) as C3
from foo
group by houroffset.

(the real model is more compex as they are different categories and volatile properties associated with the events)

and now my questions:
---------------------

- Will the above query acces the "is_event columns", or get the informmation only from the nulls bitmap within the row headers ?

- How does the splitting of the "event" information affect the row headers ?  Should  I rather define a single event column and aggregate my data using "case when else end" clauses ?

- is a where clause "where is_event_x is not null" more performant than "where is_event_x is true" as the null bitmap can thoretically be used.

- in my model, I couldn't find a way to build an index that could be used to query a single event:

   select ...
   from foo
   where is_event_1=true (|| is_event_1 is not null)

  I first imagined an index like create index foo_i on foo ((is_event_1 IS NOT NULL),  (is_event_2  IS NOT NULL),...
  But it is apparently not used  when I have a single event in my where clause.
 
  A solution would be to add a further column in my table containing the event_id and to index it:
 
  timestamp : houroffset : is_event_1 :is_event_2 : is_event_3 : event_id
 
  t1                 132         True        null         null          1
  t2                 132         null        True         null          2
  t3                 133         True        null         null          1
  t4                 134         null        null         True          3
 
  but this extra column would be redundant with the nulls bitmap. Is there a way to avoid this duplication of information ?

 

- is there a performance gain at all when columns that are seldom used are placed at the end of the rows ?
 
 
- I 'll also be thankful for any comments and critics on my model. 

cheers,

Marc

pgsql-general by date:

Previous
From: "H.J. Sanders"
Date:
Subject: Re: FW: Male/female
Next
From: Karsten Hilbert
Date:
Subject: Re: FW: Male/female