Re: range_agg - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: range_agg
Date
Msg-id CAPpHfduQ3iwkOH+myncE7Gb=dAn9vM=KrH2oALN0LQzi+XPOqA@mail.gmail.com
Whole thread Raw
In response to Re: range_agg  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: range_agg
List pgsql-hackers
On Tue, Dec 8, 2020 at 3:20 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> I'd like to publish my revision of the patch.  So Paul could start
> from it.  The changes I made are minor
> 1. Add missing types to typedefs.list
> 2. Run pg_indent run over the changed files and some other formatting changes
> 3. Reorder the regression tests to evade the error spotted by
> commitfest.cputube.org
>
> I'm switching this patch to WOA.

I decided to work on this patch myself.  The next revision is attached.

The changes are as follows.

1. CREATE TYPE ... AS RANGE command now accepts new argument
multirange_type_name.  If multirange_type_name isn't specified, then
multirange type name is selected automatically.  pg_dump always
specifies multirange_type_name (if dumping at least pg14).  Thanks to
that dumps are always restorable.
2. Multiranges now have a new binary format.  After the MultirangeType
struct, an array of offsets comes, then an array of flags and finally
bounds themselves.  Offsets points to the bounds of particular range
within multirange.  Thanks to that particular range could be accessed
by number without deserialization of the whole multirange.  Offsets
are stored in compression-friendly format similar to jsonb (actually
only every 4th of those "offsets" is really offsets, others are
lengths).
3. Most of simple functions working with multirages now don't
deserialize the whole multirange.  Instead they fetch bounds of
particular ranges, and that doesn't even require any additional memory
allocation.
4. I've removed ExpandedObject support from the patch.  I don't see
much point in it assuming all the functions are returning serialized
multirage anyway.  We can add ExpandedObject support in future if
needed.
5. multirange_contains_element(), multirange_contains_range(),
multirange_overlaps_range() now use binary search.  Thanks to binary
format, which doesn't require full deserialization, these functions
now work with O(log N) complexity.

Comments and documentation still need revision according to these
changes.  I'm going to continue with this.

------
Regards,
Alexander Korotkov

Attachment

pgsql-hackers by date:

Previous
From: Alastair Turner
Date:
Subject: Re: Proposed patch for key managment
Next
From: Tom Lane
Date:
Subject: Re: Sorting case branches in outfuncs.c/outNode alphabetically