Re: Need more reviewers! - Mailing list pgsql-hackers

From Kenneth Marshall
Subject Re: Need more reviewers!
Date
Msg-id 20080904184817.GG16619@it.is.rice.edu
In response to Re: Need more reviewers!  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Need more reviewers!  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, Sep 04, 2008 at 02:01:18PM -0400, Tom Lane wrote:
> "Jonah H. Harris" <jonah.harris@gmail.com> writes:
> > I'll push forward on reviewing and testing Xiao's hash index
> > improvements for inclusion into core.  Though, someone will still need
> > to review my stuff.
> 
> I think what the hash index patch really needs is some performance
> testing.  I'm willing to take responsibility for the code being okay
> or not, but I haven't got any production-grade hardware to do realistic
> performance tests on.  I'd like to see a few more scenarios tested than
> the one provided so far: in particular, wide vs narrow index keys and
> good vs bad key distributions.
> 
Tom,

Do you mean good vs. bad key distributions with respect to hash value
collisions? 

> It strikes me that there's an intermediate posture that the patch
> doesn't provide: namely, store *both* hash code and original index key
> in each index entry.  This would make the index larger rather than
> smaller, but you could still do the intra-page sort by hash code,
> which I think is likely a big win.  In exchange for the bigger index
> you'd avoid fruitless trips to the heap for chance hashcode matches.
> So it seems far from clear to me that the hash-code-only solution is
> necessarily the best.
> 
Currently, since a trip to the heap is required to validate any tuple
even when the exact value is stored in the index, it does not seem
like storing the value in both places would be a win. One of the
big improvements is the space efficiency of the index obtained by
storing just the hash value. I think that increasing the number of
hash bits stored would provide more bang for the buck than storing
the exact value. One easy way to provide this without increasing
the storage requirement is to take advantage of the bucket number:
since the bucket number already implies the low-order bits of the
hash, those bits can be dropped from the stored value and replaced
with additional high-order bits. At 256 buckets, dropping the implied
8 bits gives an effective 40-bit hash in the same 32 bits of storage;
at 64k buckets you can store an effective 48-bit hash, and at 16m
buckets an effective 56-bit hash, all in the original 32-bit space.
So as the number of buckets increases, the effective hash width and
its specificity increase as well, reducing the need to visit the
heap tuple store.

Another idea takes advantage of a smaller "bucket" size (I think of
them as sub-buckets) to add an additional 8 bits of hash value at
32m or 64m buckets, by looking up the hash in one of 2^n sub-buckets,
where 2^n = 64, 128, or even 256. This has the added advantage of
eliminating the binary search and sorted insertion within the bucket
page. Again, this will need to be tested.

> If anyone is willing to do comparative performance testing, I'll
> volunteer to make up two variant patches that do it both ways and
> are otherwise equivalent.
> 
I am in the same boat as you, in that I do not have database systems
with enough hardware to perform realistic testing.

> (I wouldn't be too surprised if testing shows that each way could be
> a win depending on the situation.  In that case we could think about
> how to provide a switch to let users pick the method.  But for the
> moment let's just test two separate patches.)

I think, though it is as yet untested, that storing both would be of
value when we do not have to visit the heap for visibility information;
otherwise, using that space to store more hash bits would likely be
more effective.

Cheers,
Ken

>             regards, tom lane
> 

