Thread: Find overlapping time intervals, how?
Say, we have uptimes from several servers: Server up_from up_to ------ ------- ------- s1 0:00 8:00 s1 10:00 20:00 s1 22:00 24:00 (would better be a timestamp with 0:00 and next day) s2 0:00 4:00 s2 6:00 23:00 Now we want to generate a report of the availability. But for a specific application both servers must be up at the same time. So the combined uptime would be from 0:00 to 4:00, from 6:00 to 8:00, from 10:00 to 20:00 and from 22:00 to 23:00. Then we can calculate an uptime percentage. (Another setup would be if the application is clustered, then the uptimes would be ORed instead of ANDed) What would be the most elegant way to do this? I started with a self join and 4 unions covering the 4 cases for start- end endtime: - Interval of s1 starts earlier and ends earlier than interval of s2 (up_from of s1 <= up_from of s2 and up_to of s1 <= up_to of s2 and up_to of s1 > up_from of s2) -- overlap condition Then the uptime interval is [up_from of s2, up_to of s1] ##### ##### - Interval of s2 starts earlier and ends earlier than interval of s1 Vice versa. ##### ##### - Interval of s1 contains interval of s2 (up_from of s1 <= up_from of s2 and up_to of s1 >= up_to of s2) Then the uptime interval is [up_from of s1, up_to of s1] ######## #### - Interval of s2 contains interval of s1 Vice versa. #### ######## But this looks ugly. Any ideas? I thought of area functions when using rectangles instead of times, but I don't know if that's good. -- PGP/GPG Key-ID: http://blackhole.pca.dfn.de:11371/pks/lookup?op=get&search=0xB5A1AFE1
On Tuesday 09 September 2003 02:29 pm, Holger Marzen wrote: > Say, we have uptimes from several servers: > > Server up_from up_to > ------ ------- ------- > s1 0:00 8:00 > s1 10:00 20:00 > s1 22:00 24:00 (would better be a timestamp with 0:00 and next > day) s2 0:00 4:00 > s2 6:00 23:00 > > > Now we want to generate a report of the availability. But for a specific > application both servers must be up at the same time. So the combined > uptime would be from 0:00 to 4:00, from 6:00 to 8:00, from 10:00 to > 20:00 and from 22:00 to 23:00. Then we can calculate an uptime > percentage. > > (Another setup would be if the application is clustered, then the > uptimes would be ORed instead of ANDed) > > What would be the most elegant way to do this? I started with a self > join and 4 unions covering the 4 cases for start- end endtime: > > - Interval of s1 starts earlier and ends earlier than interval of s2 > (up_from of s1 <= up_from of s2 and > up_to of s1 <= up_to of s2 and > up_to of s1 > up_from of s2) -- overlap condition > Then the uptime interval is [up_from of s2, up_to of s1] > ##### > ##### > > - Interval of s2 starts earlier and ends earlier than interval of s1 > Vice versa. > ##### > ##### > > - Interval of s1 contains interval of s2 > (up_from of s1 <= up_from of s2 and > up_to of s1 >= up_to of s2) > Then the uptime interval is [up_from of s1, up_to of s1] > ######## > #### > > - Interval of s2 contains interval of s1 > Vice versa. > #### > ######## > > But this looks ugly. > I haven't tested this; but what if you: 1. Use timestamps instead of times to account for crossing over midnight. 2. Using aliases t1 and t2 for the source data table: SELECT case when t1.up_from >= t2.up_from then t1.up_from as start_time else t2.up_from as start_time, case when t1.up_to <= t2.up_to then t1.up_to as end_time else t2.up_to as end_time, end_time - start_time as avail_time INTO temp table temp_avail FROM tablename as t1, tablename as t2 WHERE t1.server <> t2.server and (t2.upfrom between t1.up_from and t1_up_to OR t2.up_to between t1.up_from and t1_up_to) and t1.server = 's1'; 3. Now you can run a query on temp_avail for stats. ??? Andrew Gould
On Tue, Sep 09, 2003 at 09:29:58PM +0200, Holger Marzen wrote: > Say, we have uptimes from several servers: > > Server up_from up_to > ------ ------- ------- > s1 0:00 8:00 > s1 10:00 20:00 > s1 22:00 24:00 (would better be a timestamp with 0:00 and next day) > s2 0:00 4:00 > s2 6:00 23:00 You better get a timestamp field for the up_from and up_to, because it's going to be very difficult to deal with only times (unless you want a report of a single day). I've done a similar thing with something like: select max(s1_uptime.up_from, s2_uptime.up_from) as start, min(s1_uptime.up_to, s2_uptime.up_to) as end from (select up_from, up_to from table where server=s1) as s1_uptime, (select up_from, up_to from table where server=s2) as s2_uptime where (s1_uptime.up_from, s1_uptime.up_to) overlaps (s2_uptime.up_from, s2_uptime.up_to) I'm not sure if the OVERLAPS operator works on released versions; I've only used it on 7.4, and even there it was undocumented until yesterday (I found it peeking at the source looking for something else). You also have to build max(timestamp, timestamp) and min(timestamp, timestamp) functions: create function max(timestamp, timestamp) returns timestamp as 'select case if $1 > $2 then $1 else $2' language sql; (see here for OVERLAPS: http://developer.postgresql.org/docs/postgres/functions-datetime.html ) -- Alvaro Herrera (<alvherre[@]dcc.uchile.cl>) www.google.com: interfaz de linea de comando para la web.
holger@marzen.de (Holger Marzen) writes: > Say, we have uptimes from several servers: > > Server up_from up_to > ------ ------- ------- > s1 0:00 8:00 > s1 10:00 20:00 > s1 22:00 24:00 (would better be a timestamp with 0:00 and next day) > s2 0:00 4:00 > s2 6:00 23:00 > > > Now we want to generate a report of the availability. But for a specific > application both servers must be up at the same time. So the combined > uptime would be from 0:00 to 4:00, from 6:00 to 8:00, from 10:00 to > 20:00 and from 22:00 to 23:00. Then we can calculate an uptime > percentage. > > (Another setup would be if the application is clustered, then the > uptimes would be ORed instead of ANDed) > > What would be the most elegant way to do this? I started with a self > join and 4 unions covering the 4 cases for start- end endtime: Have you considered the OVERLAPS predicate? cctld=# create table server_on ( cctld(# name character varying, cctld(# up_from timestamptz, cctld(# up_to timestamptz); CREATE TABLE cctld=# insert into server_on (name, up_from, up_to) values ('s1', '2003-08-01 00:00', '2003-08-01 08:00'); INSERT 38680519 1 cctld=# insert into server_on (name, up_from, up_to) values ('s1', '2003-08-01 10:00', '2003-08-01 20:00'); INSERT 38680520 1 cctld=# insert into server_on (name, up_from, up_to) values ('s1', '2003-08-01 22:00', '2003-08-01 24:00'); ERROR: Bad timestamp external representation '2003-08-01 24:00' cctld=# insert into server_on (name, up_from, up_to) values ('s1', '2003-08-01 22:00', '2003-08-02 00:00'); INSERT 38680521 1 cctld=# insert into server_on (name, up_from, up_to) values ('s2', '2003-08-01 00:00', '2003-08-01 04:00'); INSERT 38680522 1 cctld=# insert into server_on (name, up_from, up_to) values ('s1', '2003-08-01 06:00', '2003-08-01 23:00'); INSERT 38680523 1 cctld=# select * from server_on a, server_on b where (a.up_from, a.up_to) overlaps (b.up_from, b.up_to); name | up_from | up_to | name | up_from | up_to ------+------------------------+------------------------+------+------------------------+------------------------ s1 | 2003-08-01 00:00:00-04 | 2003-08-01 08:00:00-04 | s1 | 2003-08-01 00:00:00-04 | 2003-08-01 08:00:00-04 s1 | 2003-08-01 00:00:00-04 | 2003-08-01 08:00:00-04 | s2 | 2003-08-01 00:00:00-04 | 2003-08-01 04:00:00-04 s1 | 2003-08-01 00:00:00-04 | 2003-08-01 08:00:00-04 | s1 | 2003-08-01 06:00:00-04 | 2003-08-01 23:00:00-04 s1 | 2003-08-01 10:00:00-04 | 2003-08-01 20:00:00-04 | s1 | 2003-08-01 10:00:00-04 | 2003-08-01 20:00:00-04 s1 | 2003-08-01 10:00:00-04 | 2003-08-01 20:00:00-04 | s1 | 2003-08-01 06:00:00-04 | 2003-08-01 23:00:00-04 s1 | 2003-08-01 22:00:00-04 | 2003-08-02 00:00:00-04 | s1 | 2003-08-01 22:00:00-04 | 2003-08-02 00:00:00-04 s1 | 2003-08-01 22:00:00-04 | 2003-08-02 00:00:00-04 | s1 | 2003-08-01 06:00:00-04 | 2003-08-01 23:00:00-04 s2 | 2003-08-01 00:00:00-04 | 2003-08-01 04:00:00-04 | s1 | 2003-08-01 00:00:00-04 | 2003-08-01 08:00:00-04 s2 | 2003-08-01 00:00:00-04 | 2003-08-01 04:00:00-04 | s2 | 2003-08-01 00:00:00-04 | 2003-08-01 04:00:00-04 s1 | 2003-08-01 06:00:00-04 | 2003-08-01 23:00:00-04 | s1 | 2003-08-01 00:00:00-04 | 2003-08-01 08:00:00-04 s1 | 2003-08-01 06:00:00-04 | 2003-08-01 23:00:00-04 | s1 | 2003-08-01 10:00:00-04 | 2003-08-01 20:00:00-04 s1 | 2003-08-01 06:00:00-04 | 2003-08-01 23:00:00-04 | s1 | 2003-08-01 22:00:00-04 | 2003-08-02 00:00:00-04 s1 | 2003-08-01 06:00:00-04 | 2003-08-01 23:00:00-04 | s1 | 2003-08-01 06:00:00-04 | 2003-08-01 23:00:00-04 (13 rows) Based on the overlaps, you can generate the period during which the server was up during the day. I think Celko had an exercise like this in _SQL For Smarties_. At any rate, that's not QUITE up to being the total answer, but I think you'll find "overlaps" will help make some complex joins go away. -- select 'cbbrowne' || '@' || 'libertyrms.info'; <http://dev6.int.libertyrms.com/> Christopher Browne (416) 646 3304 x124 (land)
>> Now we want to generate a report of the availability. But for a specific application both servers must be up at the same time. << Chapter 13 in SQL FOR SMARTIES. Also see Rick Snodgrass's book on temporal SQL queries. But first thing, use timestamps or this is going to be stinking mess.
Alvaro Herrera Munoz <alvherre@dcc.uchile.cl> writes: > I'm not sure if the OVERLAPS operator works on released versions; Yes, it's been there since 7.2 or so, though sadly lacking in documentation :-(. But I don't see any obvious way that it helps for this problem. regards, tom lane
On Tue, 9 Sep 2003, Alvaro Herrera Munoz wrote: > (see here for OVERLAPS: > http://developer.postgresql.org/docs/postgres/functions-datetime.html ) Overlap works great and simplifies the SQL. Nice. -- PGP/GPG Key-ID: http://blackhole.pca.dfn.de:11371/pks/lookup?op=get&search=0xB5A1AFE1