Thread: on hash indexes
Hi, indices.sgml contains this paragraph about hash indexes: Note: Testing has shown PostgreSQL's hash indexes to perform no better than B-tree indexes, and the index size and build time for hash indexes is much worse. Furthermore, hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash. For these reasons, hash index use is presently discouraged. However, it seems to me that hash indexes are much improved in 8.4, so maybe this needs to be reworded. I'm not sure to what point they have been improved though. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
I had submitted the documentation change as part of my hash function patch but it was removed as not relevant. (It wasn't really.) I would basically remove the first sentence: Note: Hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX aftera database crash. For this reason, hash index use is presently discouraged. Ken On Wed, Feb 04, 2009 at 01:22:23PM -0300, Alvaro Herrera wrote: > Hi, > > indices.sgml contains this paragraph about hash indexes: > > Note: Testing has shown PostgreSQL's hash indexes to perform no > better than B-tree indexes, and the index size and build time for hash > indexes is much worse. Furthermore, hash index operations are not > presently WAL-logged, so hash indexes might need to be rebuilt with > REINDEX after a database crash. For these reasons, hash index use is > presently discouraged. > > > However, it seems to me that hash indexes are much improved in 8.4, so > maybe this needs to be reworded. I'm not sure to what point they have > been improved though. > > -- > Alvaro Herrera http://www.CommandPrompt.com/ > PostgreSQL Replication, Consulting, Custom Development, 24x7 support > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers >
The main speed improvement is for varchar datatype. I think It should be mention here as well. IIRC, times are similar with B-Tree for integer datatype. Zdenek Kenneth Marshall píše v st 04. 02. 2009 v 13:57 -0600: > I had submitted the documentation change as part of my > hash function patch but it was removed as not relevant. > (It wasn't really.) I would basically remove the first > sentence: > > Note: Hash index operations are not presently WAL-logged, > so hash indexes might need to be rebuilt with REINDEX after a > database crash. For this reason, hash index use is presently > discouraged. > > Ken > > > On Wed, Feb 04, 2009 at 01:22:23PM -0300, Alvaro Herrera wrote: > > Hi, > > > > indices.sgml contains this paragraph about hash indexes: > > > > Note: Testing has shown PostgreSQL's hash indexes to perform no > > better than B-tree indexes, and the index size and build time for hash > > indexes is much worse. Furthermore, hash index operations are not > > presently WAL-logged, so hash indexes might need to be rebuilt with > > REINDEX after a database crash. For these reasons, hash index use is > > presently discouraged. > > > > > > However, it seems to me that hash indexes are much improved in 8.4, so > > maybe this needs to be reworded. I'm not sure to what point they have > > been improved though. > > > > -- > > Alvaro Herrera http://www.CommandPrompt.com/ > > PostgreSQL Replication, Consulting, Custom Development, 24x7 support > > > > -- > > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > > To make changes to your subscription: > > http://www.postgresql.org/mailpref/pgsql-hackers > > >
On Wed, Feb 04, 2009 at 11:22:44PM +0100, Zdenek Kotala wrote: > The main speed improvement is for varchar datatype. I think It should be > mention here as well. IIRC, times are similar with B-Tree for integer > datatype. > > Zdenek > > Kenneth Marshall p????e v st 04. 02. 2009 v 13:57 -0600: > > I had submitted the documentation change as part of my > > hash function patch but it was removed as not relevant. > > (It wasn't really.) I would basically remove the first > > sentence: > > > > Note: Hash index operations are not presently WAL-logged, > > so hash indexes might need to be rebuilt with REINDEX after a > > database crash. For this reason, hash index use is presently > > discouraged. > > > > Ken > > > > > > On Wed, Feb 04, 2009 at 01:22:23PM -0300, Alvaro Herrera wrote: > > > Hi, > > > > > > indices.sgml contains this paragraph about hash indexes: > > > > > > Note: Testing has shown PostgreSQL's hash indexes to perform no > > > better than B-tree indexes, and the index size and build time for hash > > > indexes is much worse. Furthermore, hash index operations are not > > > presently WAL-logged, so hash indexes might need to be rebuilt with > > > REINDEX after a database crash. For these reasons, hash index use is > > > presently discouraged. > > > > > > > > > However, it seems to me that hash indexes are much improved in 8.4, so > > > maybe this needs to be reworded. I'm not sure to what point they have > > > been improved though. > > > > > > -- > > > Alvaro Herrera http://www.CommandPrompt.com/ > > > PostgreSQL Replication, Consulting, Custom Development, 24x7 support > > > The speed improvement applies particularly to any datatype that is larger than an integer (typically 4 bytes). Also the size of fields that can be indexed efficiently is much, much larger than the 2K for Btree. And even 32-bit quantities may be indexed more efficiently than Btrees for large indexes due to the O(1) probe behavior. Btrees typically need to cache/probe the upper levels of the tree to locate the tuple. I have held off on extensive benchmarking until WAL has been implemented. Regards, Ken
Kenneth Marshall wrote: > I had submitted the documentation change as part of my > hash function patch but it was removed as not relevant. > (It wasn't really.) I would basically remove the first > sentence: > > Note: Hash index operations are not presently WAL-logged, > so hash indexes might need to be rebuilt with REINDEX after a > database crash. For this reason, hash index use is presently > discouraged. Change made and attached; thanks. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. + Index: doc/src/sgml/indices.sgml =================================================================== RCS file: /cvsroot/pgsql/doc/src/sgml/indices.sgml,v retrieving revision 1.75 diff -c -c -r1.75 indices.sgml *** doc/src/sgml/indices.sgml 23 Sep 2008 09:20:34 -0000 1.75 --- doc/src/sgml/indices.sgml 7 Feb 2009 20:03:51 -0000 *************** *** 190,202 **** <note> <para> ! Testing has shown <productname>PostgreSQL</productname>'s hash ! indexes to perform no better than B-tree indexes, and the ! index size and build time for hash indexes is much worse. ! Furthermore, hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with <command>REINDEX</> after a database crash. ! For these reasons, hash index use is presently discouraged. </para> </note> --- 190,199 ---- <note> <para> ! Hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with <command>REINDEX</> after a database crash. ! For this reason, hash index use is presently discouraged. </para> </note>