Thread: Hash index

Hash index

From
"RAJU kumar"
Date:

 
 
i want to find out the difference between the btree index and hash index and how exactly the hash index work.


Re: Hash index

From
Bruno Wolff III
Date:
On Fri, Aug 26, 2005 at 10:59:50 -0000,
  RAJU  kumar <raju_19db@rediffmail.com> wrote:
>   
>   
> i want to find out the difference between the btree index and hash index and how exactly the hash index work.

Why? You probably don't want to use hash indexes as btrees are virtually
always better in the current implementation.

Re: Hash index

From
Bruno Wolff III
Date:
Please keep replies posted to the list so that more people can add to and
learn from the discussion.

On Sat, Aug 27, 2005 at 08:18:57 -0000,
  RAJU  kumar <raju_19db@rediffmail.com> wrote:
> Sir i am a consultant in Sql * international india,
> And the problem with me is i have to handle all the question regarding oracle i now how btree works and also about
bitmapbut i want to know abt hash index and how and where we should use it. 
> Hope u will help me .
> raju

The short answer is that you don't want to use hash indexes in postgres.

Re: Hash index

From
"Jim C. Nasby"
Date:
On Sat, Aug 27, 2005 at 01:21:50AM -0500, Bruno Wolff III wrote:
> On Fri, Aug 26, 2005 at 10:59:50 -0000,
>   RAJU  kumar <raju_19db@rediffmail.com> wrote:
> >  ?
> >  ?
> > i want to find out the difference between the btree index and hash index and how exactly the hash index work.
>
> Why? You probably don't want to use hash indexes as btrees are virtually
> always better in the current implementation.

That may be true but it's still good to be able to see what they
actually do and how they work. In particular it would be good to know
what relation they have to the hash operations (HashAgg, HashJoin, etc).
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software        http://pervasive.com        512-569-9461

Re: Hash index

From
Scott Marlowe
Date:
On Fri, 2005-08-26 at 05:59, RAJU kumar wrote:
>
>
> i want to find out the difference between the btree index and hash
> index and how exactly the hash index work.

How do hash indexes work in postgresql?  Poorly.  badumpbump.  Thanks,
I'll be here all week (bad western humor, sorry.)

Seriously, there are issues that pop up here every so often that stump
people until we find out that they're using hash indexes and every goes
"Ohhhhh!  That's why X isn't working for you."

Currently, only btree indexes are handled properly in case of a server
crash (other indexes may get corrupted silently) and they are generally
as fast as or faster than hashes.

The basic concept in postgresql is the same as elsewhere, but they've
never been fleshed out properly, especially compared to btree, which
have had lots of testing and fixing and performance tuning and such.

Personally, I think that when one creates a non-btree index, one should
get a warning saying that non-btree indexes are not necessarily
transactionally safe in the event of a crash.

Re: Hash index

From
Tom Lane
Date:
Scott Marlowe <smarlowe@g2switchworks.com> writes:
> Personally, I think that when one creates a non-btree index, one should
> get a warning saying that non-btree indexes are not necessarily
> transactionally safe in the event of a crash.

As of 8.1, GIST indexes are WAL-logged, so that would be inappropriate
anyway.

What I foresee happening (soon, 8.2 maybe) is that rtree will go away
completely; there isn't anything it can do that GIST can't do as well or
better, so there's no point in expending maintenance resources on it.
We're really only leaving it there for 8.1 because there's so much new,
as-yet-poorly-tested code involved in the GIST replacement opclasses.

That leaves hash.  I'm hoping someone will step up and do WAL logging
for hash in the near future.  Unlike rtree, I'm not expecting that we
might get rid of hash indexes.  Even if the performance problems never
get fixed, we use hash index opclasses to manage datatype-specific
hashing for hash joins, hash aggregation, etc, so if we removed hash
indexes we'd need to find some other representation for all that.

Anyway, that's my two cents on the development roadmap for indexes.

            regards, tom lane

Re: Hash index

From
"Jim C. Nasby"
Date:
On Tue, Aug 30, 2005 at 03:32:26PM -0400, Tom Lane wrote:
> That leaves hash.  I'm hoping someone will step up and do WAL logging
> for hash in the near future.  Unlike rtree, I'm not expecting that we
> might get rid of hash indexes.  Even if the performance problems never
> get fixed, we use hash index opclasses to manage datatype-specific
> hashing for hash joins, hash aggregation, etc, so if we removed hash
> indexes we'd need to find some other representation for all that.

So does that mean a hash index could (theoretically) improve the
performance of a hash join or hash aggregation?
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software        http://pervasive.com        512-569-9461

Re: Hash index

From
Tom Lane
Date:
"Jim C. Nasby" <jnasby@pervasive.com> writes:
> So does that mean a hash index could (theoretically) improve the
> performance of a hash join or hash aggregation?

I don't think so --- the whole point of a hash is to do your matching
in-memory, rather than jumping all over the disk for it ...

            regards, tom lane

Re: Hash index

From
Scott Marlowe
Date:
On Tue, 2005-08-30 at 14:32, Tom Lane wrote:
> Scott Marlowe <smarlowe@g2switchworks.com> writes:
> > Personally, I think that when one creates a non-btree index, one should
> > get a warning saying that non-btree indexes are not necessarily
> > transactionally safe in the event of a crash.
>
> As of 8.1, GIST indexes are WAL-logged, so that would be inappropriate
> anyway.

Only in this exact instance.  It still might make sense to emit such a
warning / notice when using a non Wal logged index of any kind though.

Glad that GiST indexes got the WAL logging, btw, my tanks to whoever did
the work.

Re: Hash index

From
"Jim C. Nasby"
Date:
On Tue, Aug 30, 2005 at 04:20:03PM -0400, Tom Lane wrote:
> "Jim C. Nasby" <jnasby@pervasive.com> writes:
> > So does that mean a hash index could (theoretically) improve the
> > performance of a hash join or hash aggregation?
>
> I don't think so --- the whole point of a hash is to do your matching
> in-memory, rather than jumping all over the disk for it ...

True, but if you have an index on the value you're hashing you should be
able to find out very quickly if it exists or not, no? So if you were
doing a hash join between two tables and one of them had a hash index on
the appropriate field, couldn't you check the index first to see if
there was a match? Granted, you could probably do the same thing with a
B-tree index, but this might be faster in some scenarios.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software        http://pervasive.com        512-569-9461