Thread: Re: Hash index use presently(?) discouraged since 2005: revive or bury it?
Re: Hash index use presently(?) discouraged since 2005: revive or bury it?
From
"Kevin Grittner"
Date:
Stefan Keller wrote: > It's hard for me to imagine that btree is superior for all the > issues mentioned before. It would be great if you could show a benchmark technique which shows otherwise. -Kevin
I'm simply referring to literature (like the intro Ramakrishnan & Gehrke). I just know that Oracle an Mysql actually do have them too and use it without those current implementation specific restrictions in Postgres. IMHO by design Hash Index (e.g. linear hashing) work best when: 1. only equal (=) tests are used (on whole values) 2. columns (key values) have very-high cardinality And ideally but not necessarily when index values do not change and number of rows are known ahead of time (avoiding O(N) worst case - but there are approaches to chaining with dynamic resizing). I just collected this to encourage ourselves that enhancing hash indexes could be worthwhile. Stefan 2011/9/18 Kevin Grittner <Kevin.Grittner@wicourts.gov>: > Stefan Keller wrote: > >> It's hard for me to imagine that btree is superior for all the >> issues mentioned before. > > It would be great if you could show a benchmark technique which shows > otherwise. > > -Kevin >
Regarding the recent discussion about hash versus B-trees: Here is a trick I invented years ago to make hash indexes REALLYfast. It eliminates the need for a second disk access to check the data in almost all cases, at the cost of an additional32-bit integer in the hash-table data structure. Using this technique, we were able to load a hash-indexed database with data transfer rates that matched a cp (copy) commandof the same data on Solaris, HP-UX and IBM AIX systems. You build a normal hash table with hash-collision chains. But you add another 32-bit integer "signature" field to the hash-collisionrecord (call it "DBsig"). You also create a function: signature = sig(key) that produces digital signature. The critical factor in the sig() function is that there is an average of 9 bits set (i.e.it is somewhat "sparse" on bits). DBsig for a hash-collision chain is always the bitwise OR of every record in that hash-collision chain. When you add a recordto the hash table, you do a bitwise OR of its signature into the existing DBsig. If you delete a record, you eraseDBsig and rebuild it by recomputing the signatures of each record in the hash-collision chain and ORing them togetheragain. That means that for any key K, if K is actually on the disk, then all of the bits of sig(K) are always set in the hash-tablerecord's "DBsig". If any one bit in sig(K) isn't set in "DBsig", then K is not in the database and you don't haveto do a disk access to verify it. More formally, if sig(K) AND DBsig != sig(K) then K is definitely not in the database. A typical hash table implementation might operate with a hash table that's 50-90% full, which means that the majority ofaccesses to a hash index will return a record and require a disk access to check whether the key K is actually in the database. With the signature method, you can eliminate over 99.9% of these disk accesses -- you only have to access the datawhen you actually want to read or update it. The hash table can usually fit easily in memory even for large tables,so it is blazingly fast. Furthermore, performance degrades gracefully as the hash table becomes overloaded. Since each signature has 9 bits set,you can typically have 5-10 hash collisions (a lot of signatures ORed together in each record's DBsig) before the false-positiverate of the signature test gets too high. As the hash table gets overloaded and needs to be resized, the falsepositives increase gradually and performance decreases due to the resulting unnecessary disk fetches to check the key. In the worst case (e.g. a hash table that's overloaded by a factor of 10 or more), performance degrades to what it wouldbe without the signatures. For much higher selectivity, you could use a 64-bit signatures and make the sig() set an average of 13 bits. You'd get verygood selectivity even in a badly overloaded hash table (long hash-collision chains). This technique was never patented, and it was disclosed at several user-group meetings back in the early 1990's, so thereare no restrictions on its use. If this is of any use to anyone, maybe you could stick my name in the code somewhere. Craig James (the other Craig)
Hi, On 19 September 2011 11:14, Craig James <craig_james@emolecules.com> wrote: > DBsig for a hash-collision chain is always the bitwise OR of every record in > that hash-collision chain. When you add a record to the hash table, you do > a bitwise OR of its signature into the existing DBsig. If you delete a > record, you erase DBsig and rebuild it by recomputing the signatures of each > record in the hash-collision chain and ORing them together again. Sound like a Bloom filter [1] to me. BTW, Does Postgres use Bloom filter anywhere? [1] http://en.wikipedia.org/wiki/Bloom_filter -- Ondrej Ivanic (ondrej.ivanic@gmail.com)
2011/9/19 Ondrej Ivanič <ondrej.ivanic@gmail.com>: > BTW, Does Postgres use Bloom filter anywhere? I saw patches for at least in-memory bloom filters (for hash joins) Not sure they're committed. I think so.
On Sun, Sep 18, 2011 at 6:14 PM, Craig James <craig_james@emolecules.com> wrote: > Regarding the recent discussion about hash versus B-trees: Here is a trick I > invented years ago to make hash indexes REALLY fast. It eliminates the need > for a second disk access to check the data in almost all cases, at the cost > of an additional 32-bit integer in the hash-table data structure. I don't see how that can work unless almost all of your queries return zero rows. If it returns rows, you have to go get those rows in order to return them. And you also have to go get them to determine if they are currently visible to the current transaction. This sounds like a highly specialized data structure for a highly specialized situation. > Using this technique, we were able to load a hash-indexed database with data > transfer rates that matched a cp (copy) command of the same data on Solaris, > HP-UX and IBM AIX systems. > > You build a normal hash table with hash-collision chains. But you add > another 32-bit integer "signature" field to the hash-collision record (call > it "DBsig"). You also create a function: > > signature = sig(key) > > that produces digital signature. The critical factor in the sig() function > is that there is an average of 9 bits set (i.e. it is somewhat "sparse" on > bits). > > DBsig for a hash-collision chain is always the bitwise OR of every record in > that hash-collision chain. When you add a record to the hash table, you do > a bitwise OR of its signature into the existing DBsig. If you delete a > record, you erase DBsig and rebuild it by recomputing the signatures of each > record in the hash-collision chain and ORing them together again. Since PG doesn't store the keys in the hash index, this would mean visiting the table for every entry with the same hash code. > > That means that for any key K, if K is actually on the disk, then all of the > bits of sig(K) are always set in the hash-table record's "DBsig". If any > one bit in sig(K) isn't set in "DBsig", then K is not in the database and > you don't have to do a disk access to verify it. More formally, if > > sig(K) AND DBsig != sig(K) > > then K is definitely not in the database. > > A typical hash table implementation might operate with a hash table that's > 50-90% full, which means that the majority of accesses to a hash index will > return a record and require a disk access to check whether the key K is > actually in the database. PG hash indexes do not typically operate at anywhere near that full, because PG stores the entire 32 bit hash value. Even if there are only 8 buckets, and so only the bottom 3 bits are used to identify the bucket, once in the bucket all 32 bits are inspected for collisions. So on tables with less than several hundred million, collisions are rare except for identical keys or malicious keys. And if you want tables much larger than that, you should probably go whole hog and switch over to 64 bit. So the fullness of the hash-value space and the fullness of the actual table are two different things in PG. > With the signature method, you can eliminate over > 99.9% of these disk accesses -- you only have to access the data when you > actually want to read or update it. The hash table can usually fit easily > in memory even for large tables, so it is blazingly fast. > > Furthermore, performance degrades gracefully as the hash table becomes > overloaded. Since each signature has 9 bits set, you can typically have > 5-10 hash collisions (a lot of signatures ORed together in each record's > DBsig) before the false-positive rate of the signature test gets too high. But why not just distribute those 32/(5 to 10) bits to the ordinary hash space, increasing them from 32 to 35 bits, rather than creating a separate hash space? Doesn't that get you the same resolving power? Cheers, Jeff