Re: Counting # of consecutive rows with specified value(s)? - Mailing list pgsql-general

From David Johnston
Subject Re: Counting # of consecutive rows with specified value(s)?
Date
Msg-id FCDB9A00-31E0-48A2-98BE-39977344699D@yahoo.com
Whole thread Raw
In response to Counting # of consecutive rows with specified value(s)?  (Ken Tanzer <ken.tanzer@gmail.com>)
List pgsql-general
On Jun 6, 2012, at 22:20, Ken Tanzer <ken.tanzer@gmail.com> wrote:

> I'm working with an attendance table, where each person gets a record for each day of class, with an attendance code
(ABSENT,ATTENDED, ...).  I'm trying to figure out how to get the number of consecutive absences a person has.  I'm
guessingthis can be done without writing a function, but I need some help here. 
>
> I can currently test whether someone has at least a specified number of consecutive absences with the query below,
butit would be  better to get the actual number. 
>
> As a second question, what about getting the number of consecutive records for a set of values?  (e.g.,
attendance_codeIN ('ATTENDED','EXCUSED') 
>
> Any ideas or suggestions?  Thanks.
>
> Ken
>
> This query checks for 4 consecutive absences:
>
> SELECT client_id,
>     array(
>       SELECT attendance_code
>       FROM attendance
>       WHERE client_id=enrollment.client_id
>       ORDER BY attended_on DESC
>       LIMIT 4
>     )=array_fill('ABSENT'::varchar,array[4]) AS absent_last_4
>   FROM enrollment;
>

Quick thought if you have window functions.  If you perform a rank over the appropriate subset of columns (with order
by)you might be able to get consecutive records with matching codes to have the same rank.  The in an outer query you
cansimply sum with the rank as the group.  I am unsure which function gives you equal values for ties, or whether this
approachwill even work, but it seems worth considering at least. Using a min date your result might be readable as
"startingon 9/1, 4 absences; starting on 9/5, 21 attendeds; etc...". 

Consider using a frame of 1 prior row to see whether a given date/code is different than the immediately preceding day.
If it is that day qualifies as the possible beginning of a chain otherwise it has to be added to an existing chain.
Youprobably need to do this first then use the dates of the applicable chain starts (and ends maybe...) as fields in
thepartition mentioned above. 

HTH

David J.

pgsql-general by date:

Previous
From: Ken Tanzer
Date:
Subject: Counting # of consecutive rows with specified value(s)?
Next
From: Craig Ringer
Date:
Subject: Re: Procedural Languages