Re: updated hash functions for postgresql v1 - Mailing list pgsql-patches

From Tom Lane
Subject Re: updated hash functions for postgresql v1
Date
Msg-id 23832.1207424435@sss.pgh.pa.us
Whole thread Raw
In response to Re: updated hash functions for postgresql v1  (Simon Riggs <simon@2ndquadrant.com>)
Responses Re: updated hash functions for postgresql v1  (Kenneth Marshall <ktm@rice.edu>)
List pgsql-patches
Simon Riggs <simon@2ndquadrant.com> writes:
> On Sun, 2007-10-28 at 13:05 -0500, Kenneth Marshall wrote:
>> The new hash function is roughly twice as fast as the old function in
>> terms of straight CPU time. It uses the same design as the current
>> hash but provides code paths for aligned and unaligned access as well
>> as separate mixing functions for different blocks in the hash run
>> instead of having one general purpose block. I think the speed will
>> not be an obvious win with smaller items, but will be very important
>> when hashing larger items (up to 32kb).
>>
>> Better in this case means that the new hash mixes more thoroughly
>> which results in less collisions and more even bucket distribution.
>> There is also a 64-bit varient which is still faster since it can
>> take advantage of the 64-bit processor instruction set.

> Ken, I was really looking for some tests that show both of the above
> were true. We've had some trouble proving the claims of other algorithms
> before, so I'm less inclined to take those things at face value.

I spent some time today looking at this code more closely and running
some simple speed tests.  It is faster than what we have, although 2X
is the upper limit of the speedups I saw on four different machines.
There are several things going on in comparison to our existing
hash_any:

* If the source data is word-aligned, the new code fetches it a word at
a time instead of a byte at a time; that is

        a += (k[0] + ((uint32) k[1] << 8) + ((uint32) k[2] << 16) + ((uint32) k[3] << 24));
        b += (k[4] + ((uint32) k[5] << 8) + ((uint32) k[6] << 16) + ((uint32) k[7] << 24));
        c += (k[8] + ((uint32) k[9] << 8) + ((uint32) k[10] << 16) + ((uint32) k[11] << 24));

becomes

        a += k[0];
        b += k[1];
        c += k[2];

where k is now pointer to uint32 instead of uchar.  This accounts for
most of the speed improvement.  However, the results now vary between
big-endian and little-endian machines.  That's fine for PG's purposes.
But it means that we need two sets of code for the unaligned-input code
path, since it clearly won't do for the same bytestring to get two
different hashes depending on whether it happens to be presented aligned
or not.  The presented patch actually offers *four* code paths, so that
you can compute either little-endian-ish or big-endian-ish hashes on
either type of machine.  That's nothing but bloat for our purposes, and
should be reduced to the minimum.

* Given a word-aligned source pointer and a length that isn't a multiple
of 4, the new code fetches the last partial word as a full word fetch
and masks it off, as per the code comment:

     * "k[2]&0xffffff" actually reads beyond the end of the string, but
     * then masks off the part it's not allowed to read.  Because the
     * string is aligned, the masked-off tail is in the same word as the
     * rest of the string.  Every machine with memory protection I've seen
     * does it on word boundaries, so is OK with this.  But VALGRIND will
     * still catch it and complain.  The masking trick does make the hash
     * noticably faster for short strings (like English words).

This I think is well beyond the bounds of sanity, especially since we
have no configure support for setting #ifdef VALGRIND.  I'd lose the
"non valgrind clean" paths (which again are contributing to the patch's
impression of bloat/redundancy).

* Independently of the above changes, the actual hash calculation
(the mix() and final() macros) has been changed.  Ken claims that
this made the hash "better", but I'm deeply suspicious of that.
The comments in the code make it look like Jenkins actually sacrificed
hash quality in order to get a little more speed.  I don't think we
should adopt those changes unless some actual evidence is presented
that the hash is better and not merely faster.


In short: I think we should adopt the changes to use aligned word
fetches where possible, but not adopt the mix/final changes unless
more evidence is presented.

Lastly, the patch adds yet more code to provide the option of computing
a 64-bit hash rather than 32.  (AFAICS, the claim that this part is
optimized for 64-bit machines is mere fantasy.  It's simply Yet Another
duplicate of the identical code, but it gives you back two of its three
words of internal state at the end, instead of only one.)  As I said
before, this is just bloat for us.  I've got zero interest in pursuing
64-bit hashing when we still don't have a hash index implementation that
anyone would consider using in anger.  Let's see if we can make the cake
edible before worrying about putting a better grade of icing on it.

            regards, tom lane

pgsql-patches by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: Partial match in GIN
Next
From: Kenneth Marshall
Date:
Subject: Re: updated hash functions for postgresql v1