Re: HASH index method not correctly handling NULL text - Mailing list pgsql-general

From Neil Conway
Subject Re: HASH index method not correctly handling NULL text
Date
Msg-id 1012696685.14533.36.camel@jiro
Whole thread Raw
In response to HASH index method not correctly handling NULL text values?  (David Madore <david.madore@ens.fr>)
List pgsql-general
On Sat, 2002-02-02 at 18:49, David Madore wrote:
> Hello.
>
> I don't know if this counts as a bug or not.  I'm using PostgreSQL
> version 7.1.3, and I tried creating an index using HASH (rather than
> BTREE as I usually do) on a text column of one of my tables.  The
> table has 345442 rows, of which 344339 have a non NULL value in the
> column in question.  The index creation proceeded without trouble, but
> a subsequent "VACUUM VERBOSE ANALYZE" command gave the following
> warning: "NUMBER OF INDEX' TUPLES (344339) IS NOT THE SAME AS HEAP'
> (345442)" with the advice to recreate the index.  Essentially, my
> question is: is the warning incorrect or is the index so?  In the end
> I recreated my index as a BTREE and this gave no problem.

The hash index type has some known problems. For one thing, it has a
tendancy to deadlock under heavy load -- on my machine, 'pgbench -c 10
-t 100' will reproducibly produce deadlocks and failed queries. I would
not advise the use of hash indexes for production machines.

However, I've been starting to do some Pg hacking. Coincidentally, my
first project is to attempt to improve hash indexes. I'll see if I can
reproduce your problem on my local machine, and I'll let you know
if/when I have a fix.

> Incidentally, I might ask, which is the best choice of indexing method
> considering that the only comparison I will ever make on this column
> is equality (this is what led me to prefer HASH initially) and
> considering that the column takes of the order of magnitude of a few
> hundred distinct values (as opposed to the third-of-a-million rows of
> the table)?  The BTREE index creation was considerably faster than the
> HASH index creation.

Currently, btree is better than hash for almost any conceivable use. In
theory, I believe hash could provide slightly faster lookups than btree
(can someone confirm this?), but given the issues with hash at the
moment, it's not worth it.

Cheers,

Neil

--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC


pgsql-general by date:

Previous
From: David Madore
Date:
Subject: HASH index method not correctly handling NULL text values?
Next
From: Tom Lane
Date:
Subject: Re: HASH index method not correctly handling NULL text values?