Re: [HACKERS] WIP: BRIN bloom indexes - Mailing list pgsql-hackers
From | Nico Williams |
---|---|
Subject | Re: [HACKERS] WIP: BRIN bloom indexes |
Date | |
Msg-id | 20171027171712.GJ4496@localhost Whole thread Raw |
In response to | [HACKERS] WIP: BRIN bloom indexes (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Responses |
Re: [HACKERS] WIP: BRIN bloom indexes
Re: [HACKERS] WIP: BRIN bloom indexes |
List | pgsql-hackers |
On Thu, Oct 19, 2017 at 10:15:32PM +0200, Tomas Vondra wrote: A bloom filter index would, indeed, be wonderful. Comments: + * We use an optimisation that initially we store the uint32 values directly, + * without the extra hashing step. And only later filling the bitmap space, + * we switch to the regular bloom filter mode. I don't think that optimization is worthwhile. If I'm going to be using a Bloom filter, it's because I expect to have a lot of rows. (Granted, if I CREATE TABLE .. LIKE .. INCLUDING INDEXES, and the new table just won't have lots of rows, then I might want this optimization, but I can always just drop the Bloom filter index, or not include indexes in the LIKE.) + * XXX Perhaps we could save a few bytes by using different data types, but + * considering the size of the bitmap, the difference is negligible. A bytea is all that's needed. See below. + * XXX We could also implement "sparse" bloom filters, keeping only the + * bytes that are not entirely 0. That might make the "sorted" phase + * mostly unnecessary. Filter compression is not worthwhile. You want to have a fairly uniform hash distribution, and you want to end up with a Bloom filter that's not much more than 50% full. That means that when near 50% full the filter will not be very sparse at all. Optimizing for the not common case doesn't seem worthwhile, and adds complexity. + * XXX We can also watch the number of bits set in the bloom filter, and + * then stop using it (and not store the bitmap, to save space) when the + * false positive rate gets too high. Ah, right, what you want is a "Scalable Bloom Filter". A Scalable Bloom filter is actually a series of Bloom filters where all but the newest filter are closed to additions, and the newest filter is where you do all the additions. You generally want to make each new filter bigger than the preceding one because when searching a Scalable Bloom filter you have to search *all* of them, so you want to minimize the number of filters. Eventually, when you have enough sub-filters, you may want to re-write the whole thing so that you start with a single sub-filter that is large enough. The format of the bytea might then be something like: <size><bitmap>[[<size><bitmap>[...]] where the last bitmap is the filter that is open to additions. I wonder if there are write concurrency performance considerations here... It might be better to have a bytea value per-sub-filter so that there is no lock contention for the closed filters. The closed sub-filters are constant, thus not even shared locks are needed for them, and especially not exclusive locks. Writing to the filter will necessitate locking the entire open filter, or else byte-range locking on it. Something to think about. > Now, what about query performance? Unlike the "minmax" indexes, the > "bloom" filter can only handle equality conditions. A Bloom filter has non-zero false positives for existence, but zero false positives for non-existence. This means that a Bloom filter index can only be used for: a) non-existence tests (with equality predicates, as you point out), b) as an optimization to avoid slower index checks (or heap scans) when the Bloom filter indicates non-existence. (b) is really just an internal application of (a). There might be applications where false positives might be ok in a query like: SELECT a.* FROM a a JOIN b b USING (some_col); but for most real-world queries like that, allowing false positives by default would be very bad. An option in the index declaration could be used to enable existence equality predicates, but I would not include such an option initially -- let's see if anyone actually has a use case for it first. Of course, for something like: SELECT a.*, b.* FROM a a JOIN b b USING (some_col); a Bloom filter can only be used as an optimization to avoid using a slower index (or heap scan) on the inner table source. What I'm getting at is that the query planner really needs to understand that a Bloom filter is a probabilistic data structure. Nico -- -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
pgsql-hackers by date: