Re: WIP: BRIN multi-range indexes - Mailing list pgsql-hackers

From John Naylor
Subject Re: WIP: BRIN multi-range indexes
Date
Msg-id CACPNZCtvdOjYvYMj2y1+gbVhy_3udJR8iJRo4zkQFg=NT3_9pA@mail.gmail.com
Whole thread Raw
In response to Re: WIP: BRIN multi-range indexes  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: WIP: BRIN multi-range indexes  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
On Thu, Sep 24, 2020 at 7:50 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
>
> On Thu, Sep 24, 2020 at 05:18:03PM -0400, John Naylor wrote:

> >Hmm, how ugly would it be to change the default range size depending
> >on the opclass?
> >
>
> Not sure. What would happen for multi-column BRIN indexes with different
> opclasses?

Sounds like a can of worms. In any case I suspect if there is no more
graceful way to handle too-large filters than ERROR out the first time
trying to write to the index, this feature might meet some resistance.
Not sure what to suggest, though.

> >> I don't think the efficiency of this code matters too much - it's only
> >> used once when creating the index, so the simpler the better. Certainly
> >> for now, while testing the partitioning approach.
> >
> >To check my understanding, isn't bloom_init() called for every tuple?
> >Agreed on simplicity so done this way.
> >
>
> No, it's only called for the first non-NULL value in the page range
> (unless I made a boo boo when writing that code).

Ok, then I basically understood -- by tuple I meant BRIN tuple, pardon
my ambiguity. After thinking about it, I agree that CPU cost is
probably trivial (and if not, something is seriously wrong).

> >n    k   m      p
> >928   7  8895   0.01
> >928  10  13343  0.001  (lowest p supported in patch set)
> >928  13  17790  0.0001
> >928  10  18280  0.0001 (still works with lower k, needs higher m)
> >928  10  17790  0.00012 (even keeping m from row #3, capping k doesn't
> >degrade p much)
> >
> >Also, k seems pretty robust against small changes as long as m isn't
> >artificially constrained and as long as p is small.
> >
> >So I *think* it's okay to cap k at 10 or 12, and not bother with
> >adjusting m, which worsens space issues. As I found before, lowering k
> >raises target fpr, but seems more robust to overshooting ndistinct. In
> >any case, we only need k * 2 bytes to store the partition lengths.
> >
> >The only way I see to avoid any limitation is to make the array of
> >primes variable length, which could be done by putting the filter
> >offset calculation into a macro. But having two variable-length arrays
> >seems messy.
> >
>
> Hmmm. I wonder how would these limitations impact the conclusions from
> the one-hashing paper? Or was this just for the sake of a demonstration?

Using "10" in the patch is a demonstration, which completely supports
the current fpr allowed by the reloption, and showing what happens if
fpr is allowed to go lower. But for your question, I *think* this
consideration is independent from the conclusions. The n, k, m values
give a theoretical false positive rate, assuming a completely perfect
hashing scheme. The numbers I'm playing with show consequences in the
theoretical fpr. The point of the paper (and others like it) is how to
get the real fpr as close as possible to the fpr predicted by the
theory. My understanding anyway.

-- 
John Naylor                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



pgsql-hackers by date:

Previous
From: Etsuro Fujita
Date:
Subject: Re: Asynchronous Append on postgres_fdw nodes.
Next
From: Vladimir Sitnikov
Date:
Subject: Re: BLOB / CLOB support in PostgreSQL