Thread: pgsql: Bloom index contrib module
Bloom index contrib module

Module provides new access method. It is actually a simple Bloom filter
implemented as pgsql's index. It could give some benefits on search
with large number of columns.

Module is a single way to test generic WAL interface committed earlier.

Author: Teodor Sigaev, Alexander Korotkov
Reviewers: Aleksander Alekseev, Michael Paquier, Jim Nasby

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/9ee014fc899a28a198492b074e32b60ed8915ea9

Modified Files
--------------
contrib/Makefile                 |   1 +
contrib/bloom/.gitignore         |   4 +
contrib/bloom/Makefile           |  24 ++
contrib/bloom/blcost.c           |  48 ++++
contrib/bloom/blinsert.c         | 313 ++++++++++++++++++++++++++
contrib/bloom/bloom--1.0.sql     |  19 ++
contrib/bloom/bloom.control      |   5 +
contrib/bloom/bloom.h            | 178 +++++++++++++++
contrib/bloom/blscan.c           | 175 +++++++++++++++
contrib/bloom/blutils.c          | 463 +++++++++++++++++++++++++++++++++++++++
contrib/bloom/blvacuum.c         | 212 ++++++++++++++++++
contrib/bloom/blvalidate.c       | 220 +++++++++++++++++++
contrib/bloom/expected/bloom.out | 122 +++++++++++
contrib/bloom/sql/bloom.sql      |  47 ++++
contrib/bloom/t/001_wal.pl       |  75 +++++++
doc/src/sgml/bloom.sgml          | 218 ++++++++++++++++++
doc/src/sgml/contrib.sgml        |   1 +
doc/src/sgml/filelist.sgml       |   1 +
18 files changed, 2126 insertions(+)
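For readers unfamiliar with the idea, the general Bloom-filter signature technique that the commit message refers to can be sketched roughly as follows. This is a minimal illustration and not the code in contrib/bloom: the names Signature, mix, sig_add and sig_may_contain, the 80-bit signature length, the two bits per column, and the hash function are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SIG_BITS        80                    /* hypothetical signature length */
#define SIG_BYTES       ((SIG_BITS + 7) / 8)
#define BITS_PER_COLUMN 2                     /* hypothetical bits set per column value */

/* One fixed-size bit signature per indexed row (hypothetical layout). */
typedef struct
{
    uint8_t bits[SIG_BYTES];
} Signature;

/* Simple integer mixer; illustrative only, not the module's hash. */
static uint32_t
mix(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x45d9f3bU;
    x ^= x >> 16;
    x *= 0x45d9f3bU;
    x ^= x >> 16;
    return x;
}

/* Set BITS_PER_COLUMN bits derived from (column, value) in the signature. */
static void
sig_add(Signature *sig, uint32_t column, uint32_t value)
{
    assert(sig != NULL);

    uint32_t h = mix(value ^ mix(column + 1));

    for (int i = 0; i < BITS_PER_COLUMN; i++)
    {
        uint32_t bit = h % SIG_BITS;

        sig->bits[bit / 8] |= (uint8_t) (1u << (bit % 8));
        h = mix(h + 0x9e3779b9U);
    }
}

/*
 * A row can match only if every bit of the query signature is also set
 * in the row signature. False positives are possible, false negatives
 * are not, so matching rows must still be rechecked against the table.
 */
static bool
sig_may_contain(const Signature *row, const Signature *query)
{
    for (size_t i = 0; i < SIG_BYTES; i++)
        if ((row->bits[i] & query->bits[i]) != query->bits[i])
            return false;
    return true;
}
```

A query on any subset of the indexed columns builds its own signature with sig_add and scans the stored row signatures with sig_may_contain, which is why this kind of index can help when searches combine many columns in arbitrary ways.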
On 2016-04-01 15:49, Teodor Sigaev wrote:
> Bloom index contrib module
>
> Module provides new access method. It is actually a simple Bloom filter
> implemented as pgsql's index. It could give some benefits on search
> with large number of columns.
>
> doc/src/sgml/bloom.sgml | 218 ++++++++++++++++++

I edited the bloom.sgml text a bit.

Great stuff, thanks!

Erik Rijkers
Several non-x86 members of pgbuildfarm aren't happy with it; we are investigating the problem.

Teodor Sigaev wrote:
> Bloom index contrib module
>
> Module provides new access method. It is actually a simple Bloom filter
> implemented as pgsql's index. It could give some benefits on search
> with large number of columns.
>
> Module is a single way to test generic WAL interface committed earlier.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Teodor Sigaev <teodor@sigaev.ru> writes:
> Bloom index contrib module

skink provided some pretty suggestive evidence about why this is unstable:

==32446== VALGRINDERROR-BEGIN
==32446== Conditional jump or move depends on uninitialised value(s)
==32446==    at 0x4E2E71: writeDelta (generic_xlog.c:137)
==32446==    by 0x4E341E: GenericXLogFinish (generic_xlog.c:313)
==32446==    by 0x14E83324: blbulkdelete (blvacuum.c:149)
==32446==    by 0x4BCEE7: index_bulk_delete (indexam.c:627)
==32446==    by 0x5DE577: lazy_vacuum_index (vacuumlazy.c:1581)
==32446==    by 0x5DFB52: lazy_scan_heap (vacuumlazy.c:1273)
==32446==    by 0x5E03AA: lazy_vacuum_rel (vacuumlazy.c:249)
==32446==    by 0x5DC7B7: vacuum_rel (vacuum.c:1375)
==32446==    by 0x5DD5F7: vacuum (vacuum.c:296)
==32446==    by 0x693B71: autovacuum_do_vac_analyze (autovacuum.c:2807)
==32446==    by 0x695B2A: do_autovacuum (autovacuum.c:2328)
==32446==    by 0x696055: AutoVacWorkerMain (autovacuum.c:1647)
==32446==  Uninitialised value was created by a stack allocation
==32446==    at 0x14E82CAB: blbulkdelete (blvacuum.c:36)
==32446==
==32446== VALGRINDERROR-END

regards, tom lane
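One common way a report of this shape arises, offered here as an assumption and not as a diagnosis of blvacuum.c, is that a partly filled stack structure is copied into a page image which is later compared byte by byte against the old image. The sketch below shows that pattern and the usual remedy of zeroing the structure first. ItemArray, PAGE_SIZE, MAX_ITEMS, count_changed_bytes and update_page are hypothetical names for this example and do not come from PostgreSQL.

```c
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 64    /* hypothetical; real pages are far larger */
#define MAX_ITEMS 8     /* hypothetical capacity */

/* Hypothetical stand-in for a fixed-size array kept on the stack. */
typedef struct
{
    unsigned int nitems;
    unsigned int items[MAX_ITEMS];
} ItemArray;

/* Compile-time check that the array fits in the toy page. */
typedef char item_array_fits[(sizeof(ItemArray) <= PAGE_SIZE) ? 1 : -1];

/*
 * Count bytes that differ between two images. A delta writer performs a
 * comparison like this, and each test branches on the byte values, so
 * any uninitialised byte in either image triggers a valgrind report.
 */
static size_t
count_changed_bytes(const unsigned char *old_img,
                    const unsigned char *new_img, size_t len)
{
    size_t changed = 0;

    for (size_t i = 0; i < len; i++)
        if (old_img[i] != new_img[i])
            changed++;
    return changed;
}

/* Copy up to MAX_ITEMS values into the page; return the changed byte count. */
static size_t
update_page(unsigned char *page, const unsigned int *vals, unsigned int n)
{
    unsigned char old_img[PAGE_SIZE];
    ItemArray     arr;

    memcpy(old_img, page, PAGE_SIZE);

    /* Without this memset, the unused slots of arr hold indeterminate bytes. */
    memset(&arr, 0, sizeof(arr));
    arr.nitems = (n > MAX_ITEMS) ? MAX_ITEMS : n;
    for (unsigned int i = 0; i < arr.nitems; i++)
        arr.items[i] = vals[i];

    memcpy(page, &arr, sizeof(arr));
    return count_changed_bytes(old_img, page, PAGE_SIZE);
}
```

With the memset in place the comparison is deterministic; removing it leaves the result dependent on whatever happened to be on the stack, which is the kind of platform-dependent behaviour that could show up on only some buildfarm members.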
On 2016-04-01 14:36, Erik Rijkers wrote:
> On 2016-04-01 15:49, Teodor Sigaev wrote:
>> Bloom index contrib module
>>
>> doc/src/sgml/bloom.sgml | 218 ++++++++++++++++++
>

The size of the example table (in bloom.sgml):

    CREATE TABLE tbloom AS
      SELECT
        random()::int as i1,
        random()::int as i2,
        [...]
        random()::int as i12,
        random()::int as i13
      FROM
        generate_series(1,1000);

seems too small to demonstrate the index use.

For me, both on $BigServer at work and on $ModestDesktop at home, the 1000 rows are not enough.

I suggest making the row count in that example larger, for instance 10000, so:

    generate_series(1,10000)

Does that make sense? I realize the behavior is probably somewhat dependent on hardware and settings...

thanks,

Erik Rijkers