Re: updated hash functions for postgresql v1 - Mailing list pgsql-patches
| From | Kenneth Marshall |
|---|---|
| Subject | Re: updated hash functions for postgresql v1 |
| Date | |
| Msg-id | 20080405214050.GA4802@it.is.rice.edu |
| In response to | Re: updated hash functions for postgresql v1 (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses | Re: updated hash functions for postgresql v1 |
| List | pgsql-patches |
On Sat, Apr 05, 2008 at 03:40:35PM -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > On Sun, 2007-10-28 at 13:05 -0500, Kenneth Marshall wrote:
> >> The new hash function is roughly twice as fast as the old function in
> >> terms of straight CPU time. It uses the same design as the current
> >> hash but provides code paths for aligned and unaligned access as well
> >> as separate mixing functions for different blocks in the hash run
> >> instead of having one general-purpose block. I think the speed will
> >> not be an obvious win with smaller items, but will be very important
> >> when hashing larger items (up to 32kb).
> >>
> >> Better in this case means that the new hash mixes more thoroughly,
> >> which results in fewer collisions and a more even bucket distribution.
> >> There is also a 64-bit variant which is still faster since it can
> >> take advantage of the 64-bit processor instruction set.
>
> > Ken, I was really looking for some tests that show both of the above
> > were true. We've had some trouble proving the claims of other
> > algorithms before, so I'm less inclined to take those things at face
> > value.
>
> I spent some time today looking at this code more closely and running
> some simple speed tests. It is faster than what we have, although 2X
> is the upper limit of the speedups I saw on four different machines.
> There are several things going on in comparison to our existing
> hash_any:
>
> * If the source data is word-aligned, the new code fetches it a word at
> a time instead of a byte at a time; that is
>
> a += (k[0] + ((uint32) k[1] << 8) + ((uint32) k[2] << 16) + ((uint32) k[3] << 24));
> b += (k[4] + ((uint32) k[5] << 8) + ((uint32) k[6] << 16) + ((uint32) k[7] << 24));
> c += (k[8] + ((uint32) k[9] << 8) + ((uint32) k[10] << 16) + ((uint32) k[11] << 24));
>
> becomes
>
> a += k[0];
> b += k[1];
> c += k[2];
>
> where k is now a pointer to uint32 instead of uchar. This accounts for
> most of the speed improvement. However, the results now vary between
> big-endian and little-endian machines. That's fine for PG's purposes.
> But it means that we need two sets of code for the unaligned-input code
> path, since it clearly won't do for the same bytestring to get two
> different hashes depending on whether it happens to be presented
> aligned or not. The presented patch actually offers *four* code paths,
> so that you can compute either little-endian-ish or big-endian-ish
> hashes on either type of machine. That's nothing but bloat for our
> purposes, and should be reduced to the minimum.
>
I agree that a good portion of the speedup is due to the full-word
processing. The original code from Bob Jenkins had all of these code
paths, and I just dropped them in with a minimum of changes.
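To make the endianness point concrete, here is a small stand-alone
sketch (my own illustration, not code from the patch): on little-endian
hardware the byte-at-a-time assembly quoted above produces exactly the
same 32-bit value as a whole-word fetch, while on big-endian hardware
the two differ, which is why the aligned and unaligned paths have to
agree on a single byte order.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int
main(void)
{
    /* four arbitrary input bytes, as they might sit in a key */
    const unsigned char buf[4] = {0x01, 0x02, 0x03, 0x04};
    uint32_t byte_wise;
    uint32_t word_wise;

    /* byte-at-a-time, little-endian-style assembly, as quoted above */
    byte_wise = buf[0]
        + ((uint32_t) buf[1] << 8)
        + ((uint32_t) buf[2] << 16)
        + ((uint32_t) buf[3] << 24);

    /* whole-word fetch of the same bytes (memcpy sidesteps alignment
     * and aliasing concerns for this illustration) */
    memcpy(&word_wise, buf, sizeof(word_wise));

    if (byte_wise == word_wise)
        printf("little-endian machine: word fetch matches byte assembly\n");
    else
        printf("big-endian machine: word fetch differs from byte assembly\n");
    return 0;
}
```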
> * Given a word-aligned source pointer and a length that isn't a
> multiple of 4, the new code fetches the last partial word as a full
> word fetch and masks it off, as per the code comment:
>
>  * "k[2]&0xffffff" actually reads beyond the end of the string, but
>  * then masks off the part it's not allowed to read. Because the
>  * string is aligned, the masked-off tail is in the same word as the
>  * rest of the string. Every machine with memory protection I've seen
>  * does it on word boundaries, so is OK with this. But VALGRIND will
>  * still catch it and complain. The masking trick does make the hash
>  * noticeably faster for short strings (like English words).
>
> This I think is well beyond the bounds of sanity, especially since we
> have no configure support for setting #ifdef VALGRIND. I'd lose the
> "non valgrind clean" paths (which again are contributing to the
> patch's impression of bloat/redundancy).
>
Okay, I will strip the VALGRIND paths. I did not see a real need for
them either.

> * Independently of the above changes, the actual hash calculation
> (the mix() and final() macros) has been changed. Ken claims that
> this made the hash "better", but I'm deeply suspicious of that.
> The comments in the code make it look like Jenkins actually sacrificed
> hash quality in order to get a little more speed. I don't think we
> should adopt those changes unless some actual evidence is presented
> that the hash is better and not merely faster.
>
I was repeating the claims made by the function's author after his own
testing. His analysis and tests were reasonable, but I do agree that we
need some testing of our own. I will start pulling together some test
cases along the lines of what was discussed earlier with Simon.

> In short: I think we should adopt the changes to use aligned word
> fetches where possible, but not adopt the mix/final changes unless
> more evidence is presented.
>
Okay, I agree and will work on producing evidence either way.

> Lastly, the patch adds yet more code to provide the option of
> computing a 64-bit hash rather than 32. (AFAICS, the claim that this
> part is optimized for 64-bit machines is mere fantasy. It's simply
> Yet Another duplicate of the identical code, but it gives you back
> two of its three words of internal state at the end, instead of only
> one.) As I said before, this is just bloat for us. I've got zero
> interest in pursuing 64-bit hashing when we still don't have a hash
> index implementation that anyone would consider using in anger.
> Let's see if we can make the cake edible before worrying about
> putting a better grade of icing on it.
>
You are correct; my 64-bit claim was due to misinterpreting some
comments by the author. He personally sent a correction to the
mailing list.

Regards,
Ken Marshall
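On the 64-bit point above: as Tom describes it, the variant runs the
same three 32-bit state words as the 32-bit hash and simply hands back
two of them at the end, which the caller can concatenate. A minimal
sketch of that relationship (illustrative names only, not code from
the patch):

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* the 32-bit entry point returns one word of the final (a, b, c) state */
static uint32_t
hash32_result(uint32_t a, uint32_t b, uint32_t c)
{
    (void) a;
    (void) b;
    return c;
}

/* the "64-bit" entry point returns two of the three words, glued together */
static uint64_t
hash64_result(uint32_t a, uint32_t b, uint32_t c)
{
    (void) a;
    return ((uint64_t) b << 32) | c;
}

int
main(void)
{
    /* arbitrary final state values, just to show the relationship */
    uint32_t a = 0xdeadbeef, b = 0x12345678, c = 0x9abcdef0;

    printf("32-bit result: %08" PRIx32 "\n", hash32_result(a, b, c));
    printf("64-bit result: %016" PRIx64 "\n", hash64_result(a, b, c));
    return 0;
}
```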