Re: WIP: BRIN multi-range indexes - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: WIP: BRIN multi-range indexes
Msg-id 20200804151743.dbobvzovtnpgfpn7@development
In response to Re: WIP: BRIN multi-range indexes  (Alexander Korotkov <aekorotkov@gmail.com>)
List pgsql-hackers
On Tue, Aug 04, 2020 at 05:36:51PM +0300, Alexander Korotkov wrote:
>Hi, Tomas!
>
>Sorry for the late reply.
>
>On Sun, Jul 19, 2020 at 6:19 PM Tomas Vondra
><tomas.vondra@2ndquadrant.com> wrote:
>> I think there's a number of weak points in this approach.
>>
>> Firstly, it assumes the summaries can be represented as arrays of
>> built-in types, which I'm not really sure about. It clearly is not true
>> for the bloom opclasses, for example. But even for minmax opclasses it's
>> going to be tricky because the ranges may be on different data types so
>> presumably we'd need somewhat nested data structure.
>>
>> Moreover, multi-minmax summary contains either points or intervals,
>> which requires additional fields/flags to indicate that. That further
>> complicates the things ...
>>
>> maybe we could decompose that into separate arrays or something, but
>> honestly it seems somewhat premature - there are far more important
>> aspects to discuss, I think (e.g. how the ranges are built/merged in
>> multi-minmax, or whether bloom opclasses are useful at all).
>
>I see.  But there is at least a second option to introduce a new
>datatype with just an output function.  In the similar way
>gist/tsvector_ops uses gtsvector key type.  I think it would be more
>transparent than using just bytea.  Also, this is the way we already
>use in the core.
>

So you're proposing to have new data types "brin_minmax_multi_summary"
and "brin_bloom_summary" (or some other names), with output functions
printing something nicer? I suppose that could work, and we could even
add pageinspect functions returning the value as raw bytea.

Good idea!
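
To make the idea a bit more concrete, here is a minimal sketch of what
such an output function might print for a multi-minmax summary. All of
it is an assumption for illustration only: the SummaryItem struct, the
summary_out name, the int-only values and the "{[lo,hi], point}" text
format are hypothetical and are not taken from the patch. A real output
function would use the fmgr interface and the on-disk summary format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical in-memory form of one multi-minmax summary entry.
 * The flag distinguishes a single point from a [lo, hi] interval.
 */
typedef struct SummaryItem
{
    bool is_interval;   /* false = point (only lo is used), true = interval */
    int  lo;
    int  hi;
} SummaryItem;

/*
 * Hypothetical output routine: formats the items into buf as text such as
 * "{[1,5], 7, [10,20]}". Returns the length of the string written, or -1
 * if the buffer is too small.
 */
int
summary_out(const SummaryItem *items, size_t nitems, char *buf, size_t buflen)
{
    size_t used;
    int    n;

    assert(buf != NULL && buflen > 0);

    n = snprintf(buf, buflen, "{");
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    used = (size_t) n;

    for (size_t i = 0; i < nitems; i++)
    {
        const char *sep = (i > 0) ? ", " : "";

        if (items[i].is_interval)
            n = snprintf(buf + used, buflen - used, "%s[%d,%d]",
                         sep, items[i].lo, items[i].hi);
        else
            n = snprintf(buf + used, buflen - used, "%s%d",
                         sep, items[i].lo);

        /* snprintf reports the length it needed; bail out on truncation */
        if (n < 0 || (size_t) n >= buflen - used)
            return -1;
        used += (size_t) n;
    }

    n = snprintf(buf + used, buflen - used, "}");
    if (n < 0 || (size_t) n >= buflen - used)
        return -1;
    used += (size_t) n;

    return (int) used;
}
```

Called with two intervals and one point, this would produce a single
readable line instead of an opaque bytea dump, which is the whole point
of having a dedicated type with just an output function.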

>> >BTW, I've applied the patchset to the current master, but I got a lot
>> >of duplicate oids.  Could you please resolve these conflicts.  I think
>> >it would be good to use high oid numbers to evade conflicts during
>> >development/review, and rely on committer to set final oids (as
>> >discussed in [1]).
>> >
>> >Links
>> >1. https://www.postgresql.org/message-id/CAH2-WzmMTGMcPuph4OvsO7Ykut0AOCF_i-%3DeaochT0dd2BN9CQ%40mail.gmail.com
>>
>> Did you use the patchset from 2020/07/03? I don't get any duplicate OIDs
>> with it, and it's already using quite high OIDs (part 4 uses >= 8000,
>> part 5 uses >= 9000).
>
>Yep, it appears that I was using the wrong version of patchset.
>Patchset from 2020/07/03 works well on the current master.
>

OK, good.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


