Thread: Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

From

"Kevin Grittner"

Date:

18 September 2011, 15:17:31

Stefan Keller  wrote:

> It's hard for me to imagine that btree is superior for all the
> issues mentioned before.

It would be great if you could show a benchmark technique which shows
otherwise.

-Kevin

Re: Hash index use presently(?) discouraged since 2005: revive or bury it?

From

Stefan Keller

Date:

18 September 2011, 19:32:04

I'm simply referring to literature (like the intro Ramakrishnan & Gehrke).
I just know that Oracle an Mysql actually do have them too and use it
without those current implementation specific restrictions in
Postgres.

IMHO by design Hash Index (e.g. linear hashing) work best when:
1. only equal (=) tests are used (on whole values)
2. columns (key values) have very-high cardinality

And ideally but not necessarily when index values do not change and
number of rows are known ahead of time (avoiding O(N) worst case - but
there are approaches to chaining with dynamic resizing).

I just collected this to encourage ourselves that enhancing hash
indexes could be worthwhile.

Stefan

2011/9/18 Kevin Grittner <Kevin.Grittner@wicourts.gov>:
> Stefan Keller  wrote:
>
>> It's hard for me to imagine that btree is superior for all the
>> issues mentioned before.
>
> It would be great if you could show a benchmark technique which shows
> otherwise.
>
> -Kevin
>

How to make hash indexes fast

From

Craig James

Date:

19 September 2011, 01:17:03

Regarding the recent discussion about hash versus B-trees: Here is a trick I invented years ago to make hash indexes
REALLYfast. It eliminates the need for a second disk access to check the data in almost all cases, at the cost of an
additional32-bit integer in the hash-table data structure.

Using this technique, we were able to load a hash-indexed database with data transfer rates that matched a cp (copy)
commandof the same data on Solaris, HP-UX and IBM AIX systems.

You build a normal hash table with hash-collision chains. But you add another 32-bit integer "signature" field to the
hash-collisionrecord (call it "DBsig"). You also create a function:

signature = sig(key)

that produces digital signature. The critical factor in the sig() function is that there is an average of 9 bits set
(i.e.it is somewhat "sparse" on bits).

DBsig for a hash-collision chain is always the bitwise OR of every record in that hash-collision chain. When you add a
recordto the hash table, you do a bitwise OR of its signature into the existing DBsig. If you delete a record, you
eraseDBsig and rebuild it by recomputing the signatures of each record in the hash-collision chain and ORing them
togetheragain.

That means that for any key K, if K is actually on the disk, then all of the bits of sig(K) are always set in the
hash-tablerecord's "DBsig". If any one bit in sig(K) isn't set in "DBsig", then K is not in the database and you don't
haveto do a disk access to verify it. More formally, if

sig(K) AND DBsig != sig(K)

then K is definitely not in the database.

A typical hash table implementation might operate with a hash table that's 50-90% full, which means that the majority
ofaccesses to a hash index will return a record and require a disk access to check whether the key K is actually in the
database. With the signature method, you can eliminate over 99.9% of these disk accesses -- you only have to access the
datawhen you actually want to read or update it. The hash table can usually fit easily in memory even for large
tables,so it is blazingly fast.

Furthermore, performance degrades gracefully as the hash table becomes overloaded. Since each signature has 9 bits
set,you can typically have 5-10 hash collisions (a lot of signatures ORed together in each record's DBsig) before the
false-positiverate of the signature test gets too high. As the hash table gets overloaded and needs to be resized, the
falsepositives increase gradually and performance decreases due to the resulting unnecessary disk fetches to check the
key. In the worst case (e.g. a hash table that's overloaded by a factor of 10 or more), performance degrades to what it
wouldbe without the signatures.

For much higher selectivity, you could use a 64-bit signatures and make the sig() set an average of 13 bits. You'd get
verygood selectivity even in a badly overloaded hash table (long hash-collision chains).

This technique was never patented, and it was disclosed at several user-group meetings back in the early 1990's, so
thereare no restrictions on its use. If this is of any use to anyone, maybe you could stick my name in the code
somewhere.

Craig James (the other Craig)

Re: How to make hash indexes fast

From

Ondrej Ivanič

Date:

19 September 2011, 01:25:23

Hi,

On 19 September 2011 11:14, Craig James <craig_james@emolecules.com> wrote:
> DBsig for a hash-collision chain is always the bitwise OR of every record in
> that hash-collision chain.  When you add a record to the hash table, you do
> a bitwise OR of its signature into the existing DBsig.  If you delete a
> record, you erase DBsig and rebuild it by recomputing the signatures of each
> record in the hash-collision chain and ORing them together again.

Sound like a Bloom filter [1] to me.

BTW, Does Postgres use Bloom filter anywhere?

[1] http://en.wikipedia.org/wiki/Bloom_filter

--
Ondrej Ivanic
(ondrej.ivanic@gmail.com)

Re: How to make hash indexes fast

From

Claudio Freire

Date:

19 September 2011, 01:27:31

2011/9/19 Ondrej Ivanič <ondrej.ivanic@gmail.com>:
> BTW, Does Postgres use Bloom filter anywhere?

I saw patches for at least in-memory bloom filters (for hash joins)
Not sure they're committed. I think so.

Re: How to make hash indexes fast

From

Jeff Janes

Date:

19 September 2011, 02:58:14

On Sun, Sep 18, 2011 at 6:14 PM, Craig James <craig_james@emolecules.com> wrote:
> Regarding the recent discussion about hash versus B-trees: Here is a trick I
> invented years ago to make hash indexes REALLY fast.  It eliminates the need
> for a second disk access to check the data in almost all cases, at the cost
> of an additional 32-bit integer in the hash-table data structure.

I don't see how that can work unless almost all of your queries return
zero rows.  If it returns rows, you have to go get those rows in order
to return them.  And you also have to go get them to determine if they
are currently visible to the current transaction.  This sounds like a
highly specialized data structure for a highly specialized situation.

> Using this technique, we were able to load a hash-indexed database with data
> transfer rates that matched a cp (copy) command of the same data on Solaris,
> HP-UX and IBM AIX systems.
>
> You build a normal hash table with hash-collision chains.  But you add
> another 32-bit integer "signature" field to the hash-collision record (call
> it "DBsig").  You also create a function:
>
>  signature = sig(key)
>
> that produces digital signature.  The critical factor in the sig() function
> is that there is an average of 9 bits set (i.e. it is somewhat "sparse" on
> bits).
>
> DBsig for a hash-collision chain is always the bitwise OR of every record in
> that hash-collision chain.  When you add a record to the hash table, you do
> a bitwise OR of its signature into the existing DBsig.  If you delete a
> record, you erase DBsig and rebuild it by recomputing the signatures of each
> record in the hash-collision chain and ORing them together again.

Since PG doesn't store the keys in the hash index, this would mean
visiting the table for every entry with the same hash code.

>
> That means that for any key K, if K is actually on the disk, then all of the
> bits of sig(K) are always set in the hash-table record's "DBsig".  If any
> one bit in sig(K) isn't set in "DBsig", then K is not in the database and
> you don't have to do a disk access to verify it.  More formally, if
>
>   sig(K) AND DBsig != sig(K)
>
> then K is definitely not in the database.
>
> A typical hash table implementation might operate with a hash table that's
> 50-90% full, which means that the majority of accesses to a hash index will
> return a record and require a disk access to check whether the key K is
> actually in the database.

PG hash indexes do not typically operate at anywhere near that full,
because PG stores the entire 32 bit hash value.  Even if there are
only 8 buckets, and so only the bottom 3 bits are used to identify the
bucket, once in the bucket all 32 bits are inspected for collisions.
So on tables with less than several hundred million, collisions are
rare except for identical keys or malicious keys.  And if you want
tables much larger than that, you should probably go whole hog and
switch over to 64 bit.

So the fullness of the hash-value space and the fullness of the actual
table are two different things in PG.

> With the signature method, you can eliminate over
> 99.9% of these disk accesses -- you only have to access the data when you
> actually want to read or update it.  The hash table can usually fit easily
> in memory even for large tables, so it is blazingly fast.
>
> Furthermore, performance degrades gracefully as the hash table becomes
> overloaded.  Since each signature has 9 bits set, you can typically have
> 5-10 hash collisions (a lot of signatures ORed together in each record's
> DBsig) before the false-positive rate of the signature test gets too high.

But why not just distribute those 32/(5 to 10) bits to the ordinary
hash space, increasing them from 32 to 35 bits, rather than creating a
separate hash space?  Doesn't that get you the same resolving power?

Cheers,

Jeff