Re: WIP: BRIN multi-range indexes - Mailing list pgsql-hackers

From John Naylor
Subject Re: WIP: BRIN multi-range indexes
Date
Msg-id CAFBsxsF_nTeG8EX9gUkkpiAHOpdU_OEKbvsK3BzeMSdJbERbjg@mail.gmail.com
In response to Re: WIP: BRIN multi-range indexes  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: WIP: BRIN multi-range indexes  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Wed, Feb 3, 2021 at 7:54 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
>
> [v-20210203]

Hi Tomas,

I have some random comments from reading the patch, but I haven't yet gone into detail on the newer aspects. I'll do so in the near future.

The cfbot seems to crash on this patch during make check, but it doesn't crash for me. I'm not even sure what date that cfbot status is from.

> BLOOM
> -----

Looks good, but make sure you change the commit message -- it still refers to sorted mode.

+ * not entirely clear how to distrubute the space between those columns.

s/distrubute/distribute/

> MINMAX-MULTI
> ------------

> c) 0007 - A hybrid approach, using a buffer that is multiple of the
> user-specified value, with some safety min/max limits. IMO this is what
> we should use, although perhaps with some tuning of the exact limits.

That seems like a good approach.
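
To check my understanding of the sizing rule in (c), here is a minimal standalone sketch. It is not code from the patch: the function name, the factor of 10, and the 256/8192 limits are all hypothetical placeholders, and the real values are exactly the tuning you mention.

```c
#include <assert.h>

/* Hypothetical constants, NOT taken from the patch. */
#define BUFFER_FACTOR      10     /* buffer is this multiple of the user value */
#define BUFFER_MIN_VALUES  256    /* safety lower limit */
#define BUFFER_MAX_VALUES  8192   /* safety upper limit */

/*
 * Size the in-memory buffer as a multiple of the user-specified
 * values_per_range, clamped to the safety min/max limits.
 */
static int
hybrid_buffer_size(int values_per_range)
{
    int target;

    assert(values_per_range > 0);

    /* Apply the upper limit first, which also avoids integer overflow. */
    if (values_per_range > BUFFER_MAX_VALUES / BUFFER_FACTOR)
        return BUFFER_MAX_VALUES;

    target = values_per_range * BUFFER_FACTOR;

    /* Small user values still get a reasonably sized buffer. */
    if (target < BUFFER_MIN_VALUES)
        target = BUFFER_MIN_VALUES;

    return target;
}
```

With these made-up numbers, values_per_range = 32 gives a buffer of 320 values, while very small or very large settings hit the lower or upper limit respectively.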

+#include "access/hash.h" /* XXX strange that it fails because of BRIN_AM_OID without this */

I think you want #include "catalog/pg_am.h" here.

> Attached is a spreadsheet with benchmark results for each of those three
> approaches, on different data types (byval/byref), data set types, index
> parameters (pages/values per range) etc. I think 0007 is a reasonable
> compromise overall, with performance somewhere in between 0005 and 0006.
> Of course, there are cases where it's somewhat slow, e.g. for data types
> with expensive comparisons and data sets forcing frequent compactions,
> in which case it's ~10x slower compared to regular minmax (in most cases
> it's ~1.5x). Compared to btree, it's usually much faster - ~2-3x as fast
> (except for some extreme cases, of course).
>
>
> As for the opclasses for indexes without "natural" distance function,
> implemented in 0008, I think we should drop it. In theory it works, but

Sounds reasonable.

> The other thing we were considering was using the new multi-minmax
> opclasses as default ones, replacing the existing minmax ones. IMHO we
> shouldn't do that either. For existing minmax indexes that's useless
> (the opclass seems to be working, otherwise the index would be dropped).
> But even for new indexes I'm not sure it's the right thing, so I don't
> plan to change this.

Okay.

--
John Naylor
EDB: http://www.enterprisedb.com
