On Fri, Jul 28, 2006 at 12:14:49PM -0500, Jim C. Nasby wrote:
> On Thu, Jul 27, 2006 at 01:46:01PM -0400, Alvaro Herrera wrote:
> > Jim Nasby wrote:
> > > On Jul 25, 2006, at 3:31 PM, Tom Lane wrote:
> > > >Hannu Krosing <hannu@skype.net> writes:
> >
> > > >>What would be the use-case for hash indexes ? And what should be
> > > >>done to make them faster than btree ?
> > > >
> > > >If we knew, we'd do it ;-) But no one's put enough effort into it
> > > >to find out.
> > >
> > > Do they use the same hash algorithm as hash joins/aggregation? If so,
> > > wouldn't hash indexes be faster for those operations than regular
> > > indexes?
> >
> > The main problem doesn't seem to be in the hash algorithm (which I
> > understand to mean the hashing function), but in the protocol for
> > concurrent access of index pages, and the distribution of keys in pages
> > of a single hash key.
> >
> > This is described in a README file or a code comment somewhere in the
> > hash AM code. Someone needs to do some profiling to find out what the
> > bottleneck really is, and ideally find a way to fix it.
>
> What I'm getting at is that I've never seen any explanation for the
> theoretical use cases where a hash index would outperform a btree. If we
> knew what kind of problems hash indexes were supposed to solve, we could
> try and interest people who are solving those kinds of problems in
> fixing hash indexes.
The big win for hash indexes is the idea that searching for a single
value should only take 1 I/O operation in a perfect world. Btree can
not do that.
Ken