Re: range_agg - Mailing list pgsql-hackers

From David Fetter
Subject Re: range_agg
Date
Msg-id 20190705174531.GE24679@fetter.org
Whole thread Raw
In response to Re: range_agg  (Paul A Jungwirth <pj@illuminatedcomputing.com>)
Responses Re: range_agg  (Paul A Jungwirth <pj@illuminatedcomputing.com>)
List pgsql-hackers
On Fri, Jul 05, 2019 at 09:58:02AM -0700, Paul A Jungwirth wrote:
> On Mon, Jul 1, 2019 at 3:38 PM Jeff Davis <pgsql@j-davis.com> wrote:
> >
> > For getting into core though, it should be a more complete set of
> > related operations. The patch is implicitly introducing the concept of
> > a "multirange" (in this case, an array of ranges), but it's not making
> > the concept whole.
> >
> > What else should return a multirange? This patch implements the union-
> > agg of ranges, but we also might want an intersection-agg of ranges
> > (that is, the set of ranges that are subranges of every input). Given
> > that there are other options here, the name "range_agg" is too generic
> > to mean union-agg specifically.
> 
> Thanks for the review!
> 
> I like the idea of adding a multirange type that works like range
> types, although I'm not sure I want to build it. :-)
> 
> My own motivations for the range_agg patch are for temporal databases,
> where I have two use-cases: checking temporal foreign keys [1] and
> doing a Snodgrass "coalesce" operation [2]. The FKs use-case is more
> important. For coalesce I just immediately UNNEST the array, so a
> multirange would sort of "get in the way". It's no big deal if I can
> cast a multirange to an array, although for something that would run
> on every INSERT/UPDATE/DELETE I'd like to understand what the cast
> would cost us in performance. But coalesce isn't part of SQL:2011 and
> should be optional behavior or just something in an extension. The FKs
> use case matters to me a lot more, and I think a multirange would be
> fine for that. Also a multirange would solve the generics problem
> Pavel asked about. (I'll say more about that in a reply to him.)
> 
> I'm not convinced that an intersection aggregate function for
> multiranges would be used by anyone, but I don't mind building one.
> Every other *_agg function has an "additive" sense, not a
> "subtractive" sense. json{,b}_object_agg are the closest since you
> *could* imagine intersection semantics for those, but they are unions.
> So in terms of *naming* I think using "range_agg" for union semantics
> is natural and would fit expectations. (I'm not the first to name this
> function range_agg btw: [3]).
> 
> But there is clearly more than one worthwhile way to aggregate ranges:
> 
> - David suggested weighted_range_agg and covering_range_agg. At PGCon

If I understand the cases correctly, the combination of covering_range
and multi_range types covers all cases. To recap, covering_range_agg
assigns a weight, possibly 0, to each non-overlapping sub-range.  A
cast from covering_range to multi_range would meld adjacent ranges
with non-zero weights into single ranges in O(N) (N sub-ranges) time
and some teensy amount of memory for tracking progress of adjacent
ranges of non-zero weight.  Gaps would just be multi_range consisting
of the sub-ranges of the covering_range with weight 0, and wouldn't
require any tracking.

What have I missed?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



pgsql-hackers by date:

Previous
From: Paul A Jungwirth
Date:
Subject: Re: range_agg
Next
From: Paul A Jungwirth
Date:
Subject: Re: range_agg