Re: WIP: BRIN multi-range indexes - Mailing list pgsql-hackers

From	Tomas Vondra
Subject	Re: WIP: BRIN multi-range indexes
Date	July 19, 2020 15:19:45
Msg-id	20200719151945.p6iaexethuplxxic@development
In response to	Re: WIP: BRIN multi-range indexes (Alexander Korotkov <aekorotkov@gmail.com>)
Responses	Re: WIP: BRIN multi-range indexes
List	pgsql-hackers

On Wed, Jul 15, 2020 at 05:34:05AM +0300, Alexander Korotkov wrote:
>On Mon, Jul 13, 2020 at 5:59 PM Tomas Vondra
><tomas.vondra@2ndquadrant.com> wrote:
>> >> > If we really want to print something nicer, I'd say it needs to be a
>> >> > special function in the BRIN opclass.
>> >>
>> >> If that can be done, then +1.  We just need to ensure that the function
>> >> knows and can verify the type of index that the value comes from.  I
>> >> guess we can pass the index OID so that it can extract the opclass from
>> >> catalogs to verify.
>> >
>> >+1 from me, too. Perhaps we can have it as optional. If a BRIN opclass
>> >doesn't have it, the 'values' can be null.
>> >
>>
>> I'd say that if the opclass does not have it, then we should print the
>> bytea value (or whatever the opclass uses to store the summary) using
>> the type functions.
>
>I've read the recent messages in this thread and I'd like to share my thoughts.
>
>I think the way brin_page_items() displays values is not really
>generic.  It uses a range-like textual representation of an array of
>values, while that array doesn't necessarily have range semantics.
>
>However, I think it's good that brin_page_items() uses a type output
>function to display values.  So, it's not necessary to introduce a new
>BRIN opclass function in order to get values displayed in a
>human-readable way.  Instead, we could just make a standard of BRIN
>value to be human readable.  I see at least two possibilities for
>that.
>1. Use standard container data-types to represent BRIN values.  For
>instance we could use an array of ranges instead of bytea for
>multirange.  Not sure about how convenient/performant it would be.
>2. Introduce new data-type to represent values in BRIN index. And for
>that type we can define output function with user-readable output. We
>did similar things for GiST.  For instance, pg_trgm defines gtrgm
>type, which has no input and no output. But for BRIN opclass we can
>define type with just output.
>

I think there are a number of weak points in this approach.

Firstly, it assumes the summaries can be represented as arrays of
built-in types, which I'm not really sure about. It clearly is not true
for the bloom opclasses, for example. But even for minmax opclasses it's
going to be tricky because the ranges may be on different data types, so
presumably we'd need a somewhat nested data structure.

Moreover, a multi-minmax summary contains either points or intervals,
which requires additional fields/flags to indicate that. That further
complicates things ...
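
To illustrate why a plain array is not enough, here is a minimal sketch
in Python. It is purely illustrative: the Summary class, its field
names, and the "ranges first, then points" layout are assumptions made
for this example, not the format the patch actually uses. The point is
only that the same flat array of values cannot be interpreted without
an extra field saying how many of the values form intervals.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Summary:
    """Hypothetical multi-minmax summary; NOT the patch's actual layout."""
    nranges: int       # extra field: number of leading (lo, hi) pairs
    values: List[int]  # 2 * nranges interval boundaries, then single points

    def intervals(self) -> List[Tuple[int, int]]:
        # The first 2 * nranges values are read pairwise as closed intervals.
        v = self.values
        return [(v[2 * i], v[2 * i + 1]) for i in range(self.nranges)]

    def points(self) -> List[int]:
        # Everything after the interval boundaries is an individual point.
        return self.values[2 * self.nranges:]

    def might_contain(self, x: int) -> bool:
        # A value may be in the page range if it equals a stored point
        # or falls inside one of the stored intervals.
        return x in self.points() or any(
            lo <= x <= hi for lo, hi in self.intervals()
        )


s = Summary(nranges=2, values=[1, 5, 10, 20, 42, 99])
print(s.intervals())
print(s.points())
```

With nranges=0 the very same values array would mean six separate
points, so an output function that only sees the array cannot print
the summary correctly.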

Maybe we could decompose that into separate arrays or something, but
honestly it seems somewhat premature - there are far more important
aspects to discuss, I think (e.g. how the ranges are built/merged in
multi-minmax, or whether bloom opclasses are useful at all).


>BTW, I've applied the patchset to the current master, but I got a lot
>of duplicate oids.  Could you please resolve these conflicts.  I think
>it would be good to use high oid numbers to evade conflicts during
>development/review, and rely on committer to set final oids (as
>discussed in [1]).
>
>Links
>1. https://www.postgresql.org/message-id/CAH2-WzmMTGMcPuph4OvsO7Ykut0AOCF_i-%3DeaochT0dd2BN9CQ%40mail.gmail.com
>

Did you use the patchset from 2020/07/03? I don't get any duplicate OIDs
with it, and it's already using quite high OIDs (part 4 uses >= 8000,
part 5 uses >= 9000).

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services