bloom filter indexes? - Mailing list pgsql-general

From Mason Hale
Subject bloom filter indexes?
Date
Msg-id 8bca3aa10806030743h24a18d4dq334cf91f66aa635d@mail.gmail.com
Whole thread Raw
Responses Re: bloom filter indexes?  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-general
I've been working on partitioning a rather large dataset into multiple
tables. One limitation I've run into the lack of cross-partition-table
unique indexes. In my case I need to guarantee the uniqueness of a
two-column pair across all partitions -- and this value is not used to
partition the tables. The table is partitioned based on a insert date
timestamp.

To check the uniqueness of this value I've added an insert/update
trigger to search for matches in the other partitions. This trigger is
adding significant overhead to inserts and updates.

This sort of 'membership test' where I need only need to know if the
key exists in the table is a perfect match for bloom filter. (see:
http://en.wikipedia.org/wiki/Bloom_filter).

The Bloom filter can give false positives so using it alone won't
provide the uniqueness check I need, but it should greatly speed up
this process.

Searching around for "postgresql bloom filter" I found this message
from 2005 along the same lines:
http://archives.postgresql.org/pgsql-hackers/2005-05/msg01475.php

This thread indicates bloom filters are used in the intarray contrib
module and the tsearch2 (and I assume the built-in 8.3 full-text
search features).

I also found this assignment for CS course at the University of
Toronto, when entails using bloom filters to speed up large joins:
http://queens.db.toronto.edu/~koudas/courses/cscd43/hw2.pdf

So, my question: are there any general-purpose bloom filter
implementations for postgresql? I'm particularly interested
implementations that would be useful for partitioned tables. Is anyone
working on something like this?

thanks,
- Mason

pgsql-general by date:

Previous
From: Stephan Szabo
Date:
Subject: Re: join ... using ... and - is this expected behaviour?
Next
From: "Brian A. Seklecki"
Date:
Subject: Re: [Xen-users] SNMP AgentX subagent for Xen