Re: range_agg - Mailing list pgsql-hackers

From David Fetter
Subject Re: range_agg
Date
Msg-id 20190709155112.GH24679@fetter.org
Whole thread Raw
In response to Re: range_agg  (Paul A Jungwirth <pj@illuminatedcomputing.com>)
Responses Re: range_agg  (Paul A Jungwirth <pj@illuminatedcomputing.com>)
List pgsql-hackers
On Mon, Jul 08, 2019 at 09:46:44AM -0700, Paul A Jungwirth wrote:
> On Sat, Jul 6, 2019 at 12:13 PM Jeff Davis <pgsql@j-davis.com> wrote:
> >
> > On Fri, 2019-07-05 at 09:58 -0700, Paul A Jungwirth wrote:
> > > user-defined range types. So how about I start on it and see how it
> > > goes? I expect I can follow the existing code for range types pretty
> > > closely, so maybe it won't be too hard.
> >
> > That would be great to at least take a look. If it starts to look like
> > a bad idea, then we can re-evaluate and see if it's better to just
> > return arrays.
> 
> I made some progress over the weekend. I don't have a patch yet but I
> thought I'd ask for opinions on the approach I'm taking:
> 
> - A multirange type is an extra thing you get when you define a range
> (just like how you get a tstzrange[]). Therefore....
> - I don't need separate commands to add/drop multirange types. You get
> one when you define a range type, and if you drop a range type it gets
> dropped automatically.

Yay for fewer manual steps!

> - I'm adding a new typtype for multiranges. ('m' in pg_type).
> - I'm just adding a mltrngtypeid column to pg_range. I don't think I
> need a new pg_multirange table.

Makes sense, as they'd no longer be separate concepts.

> - You can have a multirange[].

I can see how that would fall out of this, but I'm a little puzzled as
to what people might use it for. Aggregates, maybe?

> - Multirange in/out work just like arrays, e.g. '{"[1,3)", "[5,6)"}'
> - I'll add an anymultirange pseudotype. When it's the return type of a
> function that has an "anyrange" parameter, it will use the same base
> element type. (So basically anymultirange : anyrange :: anyarray ::
> anyelement.)

Neat!

> - You can cast from a multirange to an array. The individual ranges
> are always sorted in the result array.

Is this so people can pick individual ranges out of the multirange,
or...? Speaking of casts, it's possible that a multirange is also a
range. Would it make sense to have a cast from multirange to range?

> - You can cast from an array to a multirange but it will error if
> there are overlaps (or not?).

An alternative would be to canonicalize into non-overlapping ranges.
There's some precedent for this in casts to JSONB. Maybe a function
that isn't a cast should handle such things.

> The array's ranges don't have to be sorted but they will be after a
> "round trip".

Makes sense.

> - Interesting functions:
>   - multirange_length

Is that the sum of the lengths of the ranges?  Are we guaranteeing a
measure in addition to ordering on ranges now?

>   - range_agg (range_union_agg if you like)
>   - range_intersection_agg
> - You can subscript a multirange like you do an array (? This could be
> a function instead.)

How would this play with the generic subscripting patch in flight?

> - operators:
>   - union (++) and intersection (*):
>     - We already have + for range union but it raises an error if
> there is a gap, so ++ is the same but with no errors.
>     - r ++ r = mr (commutative, associative)
>     - mr ++ r = mr
>     - r ++ mr = mr
>     - r * r = r (unchanged)
>     - mr * r = r
>     - r * mr = r

Shouldn't the two above both yield multirange ? For example, if I
understand correctly,

{"[1,3)","[5,7)"} * [2,6) should be {"[2,3)","[5,6)"}

>     - mr - r = mr
>     - r - mr = mr
>     - mr - mr = mr
>   - comparison
>     - mr = mr
>     - mr @> x

x is in the domain of the (multi)range?

>     - mr @> r
>     - mr @> mr
>     - x <@ mr
>     - r <@ mr
>     - mr <@ mr
>     - mr << mr (strictly left of)
>     - mr >> mr (strictly right of)
>     - mr &< mr (does not extend to the right of)
>     - mr &> mr (does not extend to the left of)
>   - inverse operator?:
>     - the inverse of {"[1,2)"} would be {"[null, 1)", "[2, null)"}.

Is that the same as ["(∞, ∞)"] - {"[1,2)"}? I seem to recall that the
usual convention (at least in math) is to use intervals that are
generally represented as open on the infinity side, but that might not
fit how we do things.

>     - not sure we want this or what the symbol should be. I don't like
> -mr as an inverse because then mr - mr != mr ++ -mr.

!mr , perhaps?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
Next
From: Justin Pryzby
Date:
Subject: Re: clean up docs for v12