Re: range_agg - Mailing list pgsql-hackers

From David Fetter
Subject Re: range_agg
Date
Msg-id 20200307221312.GV13804@fetter.org
Whole thread Raw
In response to Re: range_agg  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: range_agg  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sat, Mar 07, 2020 at 04:06:32PM -0500, Tom Lane wrote:
> I wrote:
> > However, what I'm on about right at the moment is that I don't think
> > there should be any delta in that test at all.  As far as I can see,
> > the design idea here is that multiranges will be automatically created
> > over range types, and the user doesn't need to do that.  To my mind,
> > that means that they're an implementation detail and should not show up as
> > separately-owned objects, any more than an autogenerated array type does.
> 
> Actually ... have you given any thought to just deciding that ranges and
> multiranges are the same type?  That is, any range can now potentially
> contain multiple segments?  That would eliminate a whole lot of the
> tedious infrastructure hacking involved in this patch, and let you focus
> on the actually-useful functionality.

If we're changing range types rather than constructing a new
multi-range layer atop them, I think it would be helpful to have some
way to figure out quickly whether this new range type was contiguous.
One way to do that would be to include a "range cardinality" in the
data structure which be the number of left ends in it.

One of the things I'd pictured doing with multiranges was along the
lines of a "full coverage" constraint like "During a shift, there can
be no interval that's not covered," which would correspond to a "range
cardinality" of 1.

I confess I'm getting a little twitchy about the idea of eliding the
cases of "one" and "many", though.

> Assuming that that's ok, it seems like we could consider the traditional
> range functions like lower() and upper() to report on the first or last
> range bound in a multirange --- essentially, they ignore any "holes"
> that exist inside the range.  And the new functions for multiranges
> act much like array slicing, in that they give you back pieces of a range
> that aren't actually of a distinct type.

So new functions along the lines of lowers(), uppers(), opennesses(),
etc.? I guess this could be extended as needs emerge.

There's another use case not yet covered here that could make this
even more complex, we should probably plan for it: multi-ranges with
weights.

For example,

SELECT weighted_range_union(r)
FROM (VALUES('[0,1)'::float8range), ('[0,3)'), '('[2,5)')) AS t(r)

would yield something along the lines of:

(([0,1),1), ([1,3),2), ([3,5),1))

and wedging that into the range type seems messy. Each range would
then have a cardinality, and each range within would have a weight,
all of which would be an increasingly heavy burden on the common case
where there's just a single range.

Enhancing a separate multirange type to have weights seems like a
cleaner path forward.

Given that, I'm -1 on mushing multi-ranges into a special case of
ranges, or /vice versa/.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: ALTER TEXT SEARCH DICTIONARY tab completion
Next
From: James Coleman
Date:
Subject: Re: [PATCH] Incremental sort (was: PoC: Partial sort)