pgsql: BRIN bloom indexes - Mailing list pgsql-committers

From: Tomas Vondra
Subject: pgsql: BRIN bloom indexes
Msg-id: E1lPlha-0007Jv-Bq@gemulon.postgresql.org
List: pgsql-committers
BRIN bloom indexes

Adds BRIN opclasses that use a Bloom filter to summarize each page range.
Indexes using the new opclasses support only equality queries (similar to
hash indexes), but that works fine for data like UUIDs, MAC addresses etc.,
for which range queries are not very common. This also means the indexes
work for data that is not well correlated with physical location within
the table, or is perhaps even entirely random (a case that the existing
BRIN minmax opclasses handle poorly).
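
As an illustration of the intended use case, the sketch below indexes a
column of random UUIDs and runs an equality lookup. The table and index
names are hypothetical, and uuid_bloom_ops is assumed here to be the uuid
counterpart of int4_bloom_ops. Whether the planner actually chooses the
index depends on table size and statistics.

```sql
-- Hypothetical table; all names are illustrative only.
CREATE TABLE events (id uuid, payload text);

-- Random UUIDs have no correlation with physical row order.
INSERT INTO events
  SELECT gen_random_uuid(), 'row ' || i
  FROM generate_series(1, 100000) AS i;

-- Opclass name assumed to follow the int4_bloom_ops naming pattern.
CREATE INDEX events_id_bloom ON events USING brin (id uuid_bloom_ops);

-- Equality is the only kind of predicate the Bloom opclass can help with;
-- a range predicate such as "id > ..." cannot use this index.
EXPLAIN
SELECT * FROM events
WHERE id = '00000000-0000-0000-0000-000000000000';
```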

It's possible to specify opclass parameters with the usual Bloom filter
parameters, i.e. the desired false-positive rate and the expected number
of distinct values per page range.

  CREATE TABLE t (a int);
  CREATE INDEX ON t
   USING brin (a int4_bloom_ops(false_positive_rate = 0.05,
                                n_distinct_per_range = 100));
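
For intuition about how these two parameters translate into filter size,
the textbook Bloom filter sizing formulas can be evaluated directly in SQL.
This is only a sketch: the column names are made up, and the actual code
in brin_bloom.c may round or cap the results differently.

```sql
-- n = expected distinct values per page range, p = false-positive rate.
WITH params AS (
  SELECT 100::float8 AS n, 0.05::float8 AS p
),
sized AS (
  -- Textbook optimum: m = -n * ln(p) / (ln 2)^2 bits.
  SELECT n, ceil(-(n * ln(p)) / (ln(2::float8) ^ 2)) AS nbits
  FROM params
)
SELECT nbits,
       -- Textbook optimum: k = (m / n) * ln 2 hash functions.
       round(ln(2::float8) * nbits / n) AS nhashes
FROM sized;
```

With the values from the example above (100 distinct values, 5% false
positives), the formulas give about 624 bits (78 bytes) and 4 hash
functions per page range.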

The opclasses do not operate on the indexed values directly; they first
compute a 32-bit hash, and the Bloom filter is built on that hash value.
Hash collisions should not be a significant issue, though, as the number
of distinct values in a page range is usually fairly small.
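
A rough way to see why 32-bit hash collisions matter little at this scale
is the birthday approximation: for n distinct values hashed uniformly into
2^32 slots, the expected number of colliding pairs is about
n * (n - 1) / 2 / 2^32. The query below is a back-of-the-envelope sketch
that assumes a uniform hash; it is not taken from the patch.

```sql
-- Expected number of colliding pairs among n distinct values,
-- assuming a uniformly distributed 32-bit hash.
SELECT n,
       (n * (n - 1) / 2) / (2::float8 ^ 32) AS expected_colliding_pairs
FROM (VALUES (100::float8), (1000::float8), (10000::float8)) AS v(n);
```

For 100 distinct values per range the estimate is on the order of 1e-6,
far below any practical false-positive rate of the filter itself.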

Bump catversion, due to various catalog changes.

Author: Tomas Vondra <tomas.vondra@postgresql.org>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Sokolov Yura <y.sokolov@postgrespro.ru>
Reviewed-by: Nico Williams <nico@cryptonector.com>
Reviewed-by: John Naylor <john.naylor@enterprisedb.com>
Discussion: https://postgr.es/m/c1138ead-7668-f0e1-0638-c3be3237e812@2ndquadrant.com
Discussion: https://postgr.es/m/5d78b774-7e9c-c94e-12cf-fef51cc89b1a%402ndquadrant.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/77b88cd1bb9041a735f24072150cacfa06c699a3

Modified Files
--------------
doc/src/sgml/brin.sgml                    | 226 ++++++++-
src/backend/access/brin/Makefile          |   1 +
src/backend/access/brin/brin_bloom.c      | 809 ++++++++++++++++++++++++++++++
src/include/catalog/catversion.h          |   2 +-
src/include/catalog/pg_amop.dat           | 116 +++++
src/include/catalog/pg_amproc.dat         | 447 +++++++++++++++++
src/include/catalog/pg_opclass.dat        |  72 +++
src/include/catalog/pg_opfamily.dat       |  38 ++
src/include/catalog/pg_proc.dat           |  34 ++
src/include/catalog/pg_type.dat           |   7 +-
src/test/regress/expected/brin_bloom.out  | 428 ++++++++++++++++
src/test/regress/expected/opr_sanity.out  |   3 +-
src/test/regress/expected/psql.out        |   3 +-
src/test/regress/expected/type_sanity.out |   7 +-
src/test/regress/parallel_schedule        |   5 +
src/test/regress/serial_schedule          |   1 +
src/test/regress/sql/brin_bloom.sql       | 376 ++++++++++++++
17 files changed, 2567 insertions(+), 8 deletions(-)

