Re: BRIN indexes - TRAP: BadArgument - Mailing list pgsql-hackers

From Greg Stark
Subject Re: BRIN indexes - TRAP: BadArgument
Msg-id CAM-w4HNEFvDpiSmvRy5-X=nPnTJaM2HQPTFUYF7Rw4-zxiXboQ@mail.gmail.com
In response to Re: BRIN indexes - TRAP: BadArgument  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: BRIN indexes - TRAP: BadArgument  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
On Mon, Nov 10, 2014 at 9:31 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Every time the index is accessed, yeah.  I'm not sure about figuring the
> initial creation details.  Do you think we need another support
> procedure to help with that?  We can add it if needed; minmax would just
> define it to InvalidOid.

I have a working bloom filter with a hard-coded filter size and a
hard-coded number of hash functions. I need to think about how I'm
going to make it more general now. I think the answer is that I should
have an index option that specifies the false positive rate, from
which the optimal filter size and number of hash functions are
calculated. It might need to peek at the table statistics to determine
the population size though. Or perhaps I should bite the bullet and
size the bloom filters based on the actual number of rows in a chunk,
since the BRIN infrastructure does allow each summary to be a
different size.
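
For illustration only, the sizing arithmetic described above could look
roughly like the sketch below. It assumes the textbook Bloom filter
formulas (m = -n ln p / (ln 2)^2 bits and k = (m / n) ln 2 hash
functions); the function name bloom_parameters and its arguments are
hypothetical and not part of the BRIN patch or any PostgreSQL API.

```python
import math


def bloom_parameters(n_items, false_positive_rate):
    """Return (filter_bits, n_hashes) for a target false positive rate.

    Hypothetical helper: uses the standard Bloom filter sizing formulas,
    not anything taken from the BRIN code.
    """
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    if not 0.0 < false_positive_rate < 1.0:
        raise ValueError("false_positive_rate must be between 0 and 1")

    ln2 = math.log(2)
    # Optimal number of bits: m = -n * ln(p) / (ln 2)^2, rounded up.
    filter_bits = math.ceil(-n_items * math.log(false_positive_rate) / ln2 ** 2)
    # Optimal number of hash functions: k = (m / n) * ln 2, at least one.
    n_hashes = max(1, round(filter_bits / n_items * ln2))
    return filter_bits, n_hashes
```

Under these formulas, 1,000 distinct values at a 1% target come out to
9,586 bits and 7 hash functions. Here n_items stands for the population
size, which is the number that would have to come from either the table
statistics or the actual row count of the chunk.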

There's another API question I have. To implement Consistent I need to
call the hash function, which in the case of functions like hashtext
could be fairly expensive, and I even need to generate multiple hash
values (though currently I'm slicing them all from the integer hash
value, so that's not too bad) and then test each of those bits. It
would be natural to call hashtext once at the start of the scan and
possibly build a bitmap and compare all of the bits in a single &
operation. But afaict there's no way to hook the beginning of the scan,
and opaque is not associated with the specific scan, so I don't think I
can cache the hash value of the scan key there safely. Is there a good
way to do it with the current API?
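
As a rough sketch of the idea in the paragraph above, the following
shows one way to slice several bit positions out of a single 32-bit
hash, fold them into a mask once, and then test a summary with a single
& operation. Everything here is an assumption made for illustration:
the function names are hypothetical, the filter size is assumed to be a
power of two, and zlib.crc32 is only a stand-in for hashtext.

```python
import zlib


def stand_in_hash(value):
    """Stand-in for hashtext: any unsigned 32-bit hash of the key bytes."""
    return zlib.crc32(value)


def slice_positions(hash32, n_hashes, filter_bits):
    """Slice n_hashes bit positions out of one 32-bit hash value."""
    width = filter_bits.bit_length() - 1
    if width < 1 or filter_bits != 1 << width:
        raise ValueError("filter_bits must be a power of two >= 2")
    if n_hashes < 1 or n_hashes * width > 32:
        raise ValueError("not enough hash bits for that many slices")
    low_bits = filter_bits - 1
    # Each slice is the next `width` bits of the hash, low bits first.
    return [(hash32 >> (i * width)) & low_bits for i in range(n_hashes)]


def key_mask(hash32, n_hashes, filter_bits):
    """Build the bitmap for one key; this could be done once per scan."""
    mask = 0
    for position in slice_positions(hash32, n_hashes, filter_bits):
        mask |= 1 << position
    return mask


def add_value(summary, hash32, n_hashes, filter_bits):
    """Set the key's bits in a summary (held here as a Python int)."""
    return summary | key_mask(hash32, n_hashes, filter_bits)


def maybe_contains(summary, mask):
    """Single & test: True if every bit of the key's mask is set."""
    return (summary & mask) == mask
```

With the mask computed once for the scan key, each per-range check
reduces to maybe_contains(summary, mask). Where that cached mask could
safely live is exactly the open API question.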

On a side note, I'm curious about something. I was stepping through
my code in gdb and discovered that a single-row insert appeared to
construct a new summary and then union it into the existing summary
instead of just calling AddValue on the existing summary. Is that
intentional? What led to that?

-- 
greg