Re: partitioning a dataset + employing hysteresis condition - Mailing list pgsql-general

From Amit Dor-Shifer
Subject Re: partitioning a dataset + employing hysteresis condition
Date
Msg-id CAAznTxGDtD=Rg+0AXuM5+33P-B52p325JXYuPLY8PCUTDEjPMA@mail.gmail.com
Whole thread Raw
In response to Re: partitioning a dataset + employing hysteresis condition  (David Johnston <polobo@yahoo.com>)
List pgsql-general


On Wed, Nov 16, 2011 at 4:58 PM, David Johnston <polobo@yahoo.com> wrote:
On Nov 15, 2011, at 15:28, Gavin Flower <GavinFlower@archidevsys.co.nz> wrote:

On 14/11/11 18:35, Amit Dor-Shifer wrote:

On Mon, Nov 14, 2011 at 4:29 PM, Amit Dor-Shifer <amit.dor.shifer@gmail.com> wrote:
Hi,
I've got this table:
create table phone_calls
(
    start_time timestamp,
    device_id integer,
    term_status integer
);

It describes phone call events. A 'term_status' is a sort-of an exit status for the call, whereby a value != 0 indicates some sort of error.
Given that, I wish to retrieve data on devices with a persisting error on them, of a specific type. I.E. that their last term_status was, say 2. I'd like to employ some hysteresis on the query: only consider a device as errorring if:
1. the last "good" (0) term_status pre-dates a "bad" (2) term_status.
2. it has at least N "bad" term_status events following the last "good" one.
3. The time span between the first "bad" term_status event and the last one is >= T minutes

For instance, w/the following data set:


Alternative thought,

Have a Boolean field which is set to true for non-zero entries and false for zeros.  Upon entering a zero into the table, for a given device, set all currently true records to false.  Combine with a partial index on the true and you can quickly get a listing of all devices in error mode and all the recent error entries.

David J.

Really appreciate the attention. Thanks!

Here's what I've so-far come up with:

SELECT pc.device_id, MAX(lsc.last_successful) AS last_successful, MIN(pc.start_time) AS faulting_since
FROM (
SELECT MAX(start_time) AS last_successful, device_id FROM phone_calls
WHERE term_status IS NOT NULL AND term_status = 0
GROUP BY device_id
UNION
SELECT NULL AS last_successful, device_id FROM phone_calls
GROUP BY device_id
HAVING EVERY(term_status = 2) = TRUE
) AS lsc,
phone_calls pc
WHERE
pc.device_id=lsc.device_id
AND pc.term_status=2
AND (lsc.last_successful is NULL OR pc.start_time > lsc.last_successful)
GROUP BY pc.device_id
HAVING MIN(pc.start_time) < ?
AND COUNT(term_status) >= ?

The two parameters to the query are T & N, in order, with a slight change: T is a timestamp, so instead of specifying the a time span of 2 minutes, I pass it as NOW() - INTERVAL '2 minute'.

with T=NOW() - INTERVAL '2 minute'  I get the following on the a/m dataset:
N=4:
40;"2010-07-01 00:40:22";"2010-07-01 00:41:22"

N=3:
40;"2010-07-01 00:40:22";"2010-07-01 00:41:22"
50;"2010-07-01 12:00:00";"2010-07-01 12:01:00"
2;"2011-11-16 21:56:59.52107";"2011-11-16 21:57:59.52107"

N=2:
40;"2010-07-01 00:40:22";"2010-07-01 00:41:22"
2;"2011-11-16 21:55:16.88869";"2011-11-16 21:56:16.88869"
30;"2010-07-01 15:14:33";"2010-07-01 15:15:33"
50;"2010-07-01 12:00:00";"2010-07-01 12:01:00"

* The bit with the union is to accommodate for devices which never see a single successful term_status.

Thanks a lot for the helpful hints :)




pgsql-general by date:

Previous
From: Ivan Sergio Borgonovo
Date:
Subject: Re: how to drop function?
Next
From: kyp404
Date:
Subject: PostgreSQL-Slony error?