Re: Hash indexes (was: On-disk bitmap index patch) - Mailing list pgsql-hackers

From Kenneth Marshall
Subject Re: Hash indexes (was: On-disk bitmap index patch)
Date
Msg-id 20060728174412.GI25385@it.is.rice.edu
Whole thread Raw
In response to Re: Hash indexes (was: On-disk bitmap index patch)  ("Jim C. Nasby" <jnasby@pervasive.com>)
List pgsql-hackers
On Fri, Jul 28, 2006 at 12:14:49PM -0500, Jim C. Nasby wrote:
> On Thu, Jul 27, 2006 at 01:46:01PM -0400, Alvaro Herrera wrote:
> > Jim Nasby wrote:
> > > On Jul 25, 2006, at 3:31 PM, Tom Lane wrote:
> > > >Hannu Krosing <hannu@skype.net> writes:
> > 
> > > >>What would be the use-case for hash indexes ? And what should be
> > > >>done to make them faster than btree ?
> > > >
> > > >If we knew, we'd do it ;-)  But no one's put enough effort into it
> > > >to find out.
> > > 
> > > Do they use the same hash algorithm as hash joins/aggregation? If so,  
> > > wouldn't hash indexes be faster for those operations than regular  
> > > indexes?
> > 
> > The main problem doesn't seem to be in the hash algorithm (which I
> > understand to mean the hashing function), but in the protocol for
> > concurrent access of index pages, and the distribution of keys in pages
> > of a single hash key.
> > 
> > This is described in a README file or a code comment somewhere in the
> > hash AM code.  Someone needs to do some profiling to find out what the
> > bottleneck really is, and ideally find a way to fix it.
> 
> What I'm getting at is that I've never seen any explanation for the
> theoretical use cases where a hash index would outperform a btree. If we
> knew what kind of problems hash indexes were supposed to solve, we could
> try and interest people who are solving those kinds of problems in
> fixing hash indexes.

The big win for hash indexes is the idea that searching for a single
value should only take 1 I/O operation in a perfect world. Btree can
not do that.

Ken


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: 64-bit integers for GUC
Next
From: David Fetter
Date:
Subject: Re: Let psql process files with > 4,294,967,295 lines