Thread: Hash index
On Fri, Aug 26, 2005 at 10:59:50 -0000, RAJU kumar <raju_19db@rediffmail.com> wrote: > > > i want to find out the difference between the btree index and hash index and how exactly the hash index work. Why? You probably don't want to use hash indexes as btrees are virtually always better in the current implementation.
Please keep replies posted to the list so that more people can add to and learn from the discussion. On Sat, Aug 27, 2005 at 08:18:57 -0000, RAJU kumar <raju_19db@rediffmail.com> wrote: > Sir i am a consultant in Sql * international india, > And the problem with me is i have to handle all the question regarding oracle i now how btree works and also about bitmapbut i want to know abt hash index and how and where we should use it. > Hope u will help me . > raju The short answer is that you don't want to use hash indexes in postgres.
On Sat, Aug 27, 2005 at 01:21:50AM -0500, Bruno Wolff III wrote: > On Fri, Aug 26, 2005 at 10:59:50 -0000, > RAJU kumar <raju_19db@rediffmail.com> wrote: > > ? > > ? > > i want to find out the difference between the btree index and hash index and how exactly the hash index work. > > Why? You probably don't want to use hash indexes as btrees are virtually > always better in the current implementation. That may be true but it's still good to be able to see what they actually do and how they work. In particular it would be good to know what relation they have to the hash operations (HashAgg, HashJoin, etc). -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com 512-569-9461
On Fri, 2005-08-26 at 05:59, RAJU kumar wrote: > > > i want to find out the difference between the btree index and hash > index and how exactly the hash index work. How do hash indexes work in postgresql? Poorly. badumpbump. Thanks, I'll be here all week (bad western humor, sorry.) Seriously, there are issues that pop up here every so often that stump people until we find out that they're using hash indexes and every goes "Ohhhhh! That's why X isn't working for you." Currently, only btree indexes are handled properly in case of a server crash (other indexes may get corrupted silently) and they are generally as fast as or faster than hashes. The basic concept in postgresql is the same as elsewhere, but they've never been fleshed out properly, especially compared to btree, which have had lots of testing and fixing and performance tuning and such. Personally, I think that when one creates a non-btree index, one should get a warning saying that non-btree indexes are not necessarily transactionally safe in the event of a crash.
Scott Marlowe <smarlowe@g2switchworks.com> writes: > Personally, I think that when one creates a non-btree index, one should > get a warning saying that non-btree indexes are not necessarily > transactionally safe in the event of a crash. As of 8.1, GIST indexes are WAL-logged, so that would be inappropriate anyway. What I foresee happening (soon, 8.2 maybe) is that rtree will go away completely; there isn't anything it can do that GIST can't do as well or better, so there's no point in expending maintenance resources on it. We're really only leaving it there for 8.1 because there's so much new, as-yet-poorly-tested code involved in the GIST replacement opclasses. That leaves hash. I'm hoping someone will step up and do WAL logging for hash in the near future. Unlike rtree, I'm not expecting that we might get rid of hash indexes. Even if the performance problems never get fixed, we use hash index opclasses to manage datatype-specific hashing for hash joins, hash aggregation, etc, so if we removed hash indexes we'd need to find some other representation for all that. Anyway, that's my two cents on the development roadmap for indexes. regards, tom lane
On Tue, Aug 30, 2005 at 03:32:26PM -0400, Tom Lane wrote: > That leaves hash. I'm hoping someone will step up and do WAL logging > for hash in the near future. Unlike rtree, I'm not expecting that we > might get rid of hash indexes. Even if the performance problems never > get fixed, we use hash index opclasses to manage datatype-specific > hashing for hash joins, hash aggregation, etc, so if we removed hash > indexes we'd need to find some other representation for all that. So does that mean a hash index could (theoretically) improve the performance of a hash join or hash aggregation? -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com 512-569-9461
"Jim C. Nasby" <jnasby@pervasive.com> writes: > So does that mean a hash index could (theoretically) improve the > performance of a hash join or hash aggregation? I don't think so --- the whole point of a hash is to do your matching in-memory, rather than jumping all over the disk for it ... regards, tom lane
On Tue, 2005-08-30 at 14:32, Tom Lane wrote: > Scott Marlowe <smarlowe@g2switchworks.com> writes: > > Personally, I think that when one creates a non-btree index, one should > > get a warning saying that non-btree indexes are not necessarily > > transactionally safe in the event of a crash. > > As of 8.1, GIST indexes are WAL-logged, so that would be inappropriate > anyway. Only in this exact instance. It still might make sense to emit such a warning / notice when using a non Wal logged index of any kind though. Glad that GiST indexes got the WAL logging, btw, my tanks to whoever did the work.
On Tue, Aug 30, 2005 at 04:20:03PM -0400, Tom Lane wrote: > "Jim C. Nasby" <jnasby@pervasive.com> writes: > > So does that mean a hash index could (theoretically) improve the > > performance of a hash join or hash aggregation? > > I don't think so --- the whole point of a hash is to do your matching > in-memory, rather than jumping all over the disk for it ... True, but if you have an index on the value you're hashing you should be able to find out very quickly if it exists or not, no? So if you were doing a hash join between two tables and one of them had a hash index on the appropriate field, couldn't you check the index first to see if there was a match? Granted, you could probably do the same thing with a B-tree index, but this might be faster in some scenarios. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com 512-569-9461