Re: Given a set of daterange, finding the continuous range thatincludes a particular date (aggregates) - Mailing list pgsql-general

From mariusz
Subject Re: Given a set of daterange, finding the continuous range thatincludes a particular date (aggregates)
Date
Msg-id 1519814946.32586.203.camel@mtvk.pl
Whole thread Raw
In response to Re: Given a set of daterange, finding the continuous range thatincludes a particular date (aggregates)  (Ken Tanzer <ken.tanzer@gmail.com>)
List pgsql-general
hi,

On Tue, 2018-02-27 at 16:15 -0800, Ken Tanzer wrote:
> Hi.  Thanks so much for your assistance.  This is definitely getting
> the results I was looking for.  It is still syntacticallly more
> cumbersome than I might have hoped, but I can work with it.  So I've
> got two follow questions/issues:
> 
> 
> 1)  I can see there are many, more complex, options for aggregates,
> which I am trying to wrap my mind around.  I'm wondering if any of
> these (esp. partial aggregates/combine functions, final functions or
> moving aggregates) could be used to streamline this into a single
> function call, or if that is barking up a dead tree.
> 
i'm not an expert and i don't use fancy aggregates too often, just got a
simple solution, so i may be wrong, but i don't see simple solution as
one function. so unless someone comes with a better idea, you have to
stay with this, which isn't complex as the idea and solution is really
very simple (simplicity is somewhat visually hidden in that case by
converting dates to ranges and to boundary dates again (and again to
ranges for visualizing result)).
but it requires additional sort for each window as you can see in
explain which is the cost to pay
> 
> 2)  I'm sure at this point I must be being dense, but after re-reading
> docs multiple times, I am still confused about one piece of this:
> 
> 
>         > first window (within lower() func) extends its initial
>         (earliest) range
>         > to right for every next row continuous with current range
>         (and jumps to
>         > new start on discontinuity), thus over(order by d ASC)
>         > second window (within upper() func) extends its initial
>         (latest) range
>         > to left for every next row continuous with current range
>         (and jumps to
>         > new end on discontinuity), thus over(order by d DESC)
>         > partition by client_id within each over() is to not mix
>         client_ids'
>         > dateranges
>         >
>         
>         
> which is what is it exactly that is triggering Postgresql to know
> there is a discontinuity and to start with a new range?  And is it
> based on the input or the output values?  Based on PARTITION BY
> client_id ORDER BY d, I might have thought it was d.  But that doesn't
> seem to be right.  So is it something about what agg_daterange is
> returning, and if so what?  Again, sorry for being dense about this
> one.
> 
posgresql itself doesn't know nor care about those discontinuities, all
it cares about is partition by client_id to not mix client_ids, and
applying our agg function for rows in order of our choice.

here again i remind you, you don't really want to sort by d::daterange
for second descending window, but end_date because desc order of range
will not be the same as desc order of end_date in general case,
and start_date for first window (but as i already said, for that first
ascending window it's cosmetics as order of d::daterange and
start_date::date will be the same.
that doesn't really matter in your case of exclusive ranges but matters
in more general case.

and back to your question, we find discontinuity in our function. see
the query

SELECT CASE WHEN d1 && d2 OR d1 -|- d2 THEN d1 + d2 ELSE d2 END

where d1 is internal state value of aggregate and d2 is current row's
daterange value.
we check for continuity by d1 && d2 OR d1 -|- d2, and if it is
continuous than we just extend result (return value for current row and
new state value of agg) by d1+d2,
otherwise, when we find discontinuity, we forget d1 (agg state value
until now) and simply return d2 being current row's range, thus starting
with new range.

hope that explains enough. as i already said, i'm not an expert, i'm
just coincidentally working currently on my semi-toy project which
utilizes dateranges quite heavily.
anyway, feel free to ask if you have any further questions. for now i'm
glad i could help somehow.

regards,
mariusz jadczak

> 
> Thanks!
> 
> 
> Ken
> 
> 
> 
> -- 
> 
> AGENCY Software  
> A Free Software data system
> By and for non-profits
> http://agency-software.org/
> https://demo.agency-software.org/client
> 
> ken.tanzer@agency-software.org
> (253) 245-3801
> 
> 
> Subscribe to the mailing list to
> learn more about AGENCY or
> follow the discussion.




pgsql-general by date:

Previous
From: Michael Krüger
Date:
Subject: Re: Parallel Query - Can it be used within functions?
Next
From: pkashimalla
Date:
Subject: How to avoid trailing zero (after decimal point) for numeric typecolumn