Re: What is wrong with hashed index usage? - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: What is wrong with hashed index usage?
Date
Msg-id 200206211906.g5LJ6Mn08188@candle.pha.pa.us
Whole thread Raw
In response to Re: What is wrong with hashed index usage?  ("Dann Corbit" <DCorbit@connx.com>)
List pgsql-hackers
Dann Corbit wrote:
> > > This change strikes me as a step backwards.  The existing
> > wording tells
> > > the truth; the proposed revision removes the facts in favor
> > of a blanket
> > > assertion that is demonstrably false.
> >
> > OK, which part of is "demonstrably false"?  I think the old "should
> > generally be preferred" is too vague.  No one has come up with a case
> > where hash has shown to be faster, and a lot of cases where
> > it is slower.
>
> I agree with Tom.  Maybe it is not true for PostgreSQL that hashed
> indexes are better, but for every other database if you are doing single
> lookups and do not need to order the items sequentially, hashed indexes
> are better.  What this indicates to me is that hashed indexes could
> {potentially} be much better implemented for PostgreSQL.

Yes, our implementation needs help.  People who know other db's are
probably choosing hash thinking it is as good as btree in our code, and
it isn't.  That's why I wanted the documentation update, and why I am
suggesting the elog(NOTICE).

I have updated the documentation to specifically mention that
PostgreSQL's hashes are slower/similar to btree.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Index: doc/src/sgml/diskusage.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/diskusage.sgml,v
retrieving revision 1.1
diff -c -r1.1 diskusage.sgml
*** doc/src/sgml/diskusage.sgml    13 Jun 2002 05:15:22 -0000    1.1
--- doc/src/sgml/diskusage.sgml    21 Jun 2002 19:06:03 -0000
***************
*** 22,31 ****
    </para>

    <para>
!    You can monitor disk space from two places; from inside
!    <application>psql</> and from the command line using
!    <application>contrib/oid2name</>. Using <application>psql</> you can
!    issue queries to see the disk usage for any table:
  <programlisting>
  play=# SELECT relfilenode, relpages
  play-# FROM pg_class
--- 22,33 ----
    </para>

    <para>
!    You can monitor disk space from three places: from
!    <application>psql</> using <command>VACUUM</> information, from
!    <application>psql</> using <application>contrib/dbsize</>, and from
!    the command line using <application>contrib/oid2name</>. Using
!    <application>psql</> on a recently vacuumed (or analyzed) database,
!    you can issue queries to see the disk usage of any table:
  <programlisting>
  play=# SELECT relfilenode, relpages
  play-# FROM pg_class
***************
*** 38,47 ****
    </para>

    <para>
!    Each page is typically 8 kilobytes. <literal>relpages</> is only
!    updated by <command>VACUUM</> and <command>ANALYZE</>. To show the
!    space used by <acronym>TOAST</> tables, use a query based on the heap
!    relfilenode:
  <programlisting>
  play=# SELECT relname, relpages
  play-# FROM pg_class
--- 40,49 ----
    </para>

    <para>
!    Each page is typically 8 kilobytes. (Remember, <literal>relpages</>
!    is only updated by <command>VACUUM</> and <command>ANALYZE</>.) To
!    show the space used by <acronym>TOAST</> tables, use a query based on
!    the heap relfilenode shown above:
  <programlisting>
  play=# SELECT relname, relpages
  play-# FROM pg_class
Index: doc/src/sgml/indices.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/indices.sgml,v
retrieving revision 1.33
diff -c -r1.33 indices.sgml
*** doc/src/sgml/indices.sgml    21 Jun 2002 16:52:00 -0000    1.33
--- doc/src/sgml/indices.sgml    21 Jun 2002 19:06:04 -0000
***************
*** 181,190 ****
  </synopsis>
     <note>
      <para>
!      Testing has shown hash indexes to be similar or slower than btree
!      indexes, and the index size and build time for hash indexes is much
!      worse. Hash indexes also suffer poor performance under high
!      concurrency. For these reasons, hash index use is discouraged.
      </para>
     </note>
    </para>
--- 181,191 ----
  </synopsis>
     <note>
      <para>
!      Testing has shown PostgreSQL's hash indexes to be similar or slower
!      than btree indexes, and the index size and build time for hash
!      indexes is much worse. Hash indexes also suffer poor performance
!      under high concurrency. For these reasons, hash index use is
!      discouraged.
      </para>
     </note>
    </para>
Index: doc/src/sgml/ref/create_index.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/create_index.sgml,v
retrieving revision 1.33
diff -c -r1.33 create_index.sgml
*** doc/src/sgml/ref/create_index.sgml    21 Jun 2002 16:52:00 -0000    1.33
--- doc/src/sgml/ref/create_index.sgml    21 Jun 2002 19:06:05 -0000
***************
*** 330,339 ****
      the <literal>=</literal> operator.
     </para>
     <para>
!      Testing has shown hash indexes to be similar or slower than btree
!      indexes, and the index size and build time for hash indexes is much
!      worse. Hash indexes also suffer poor performance under high
!      concurrency. For these reasons, hash index use is discouraged.
     </para>

     <para>
--- 330,340 ----
      the <literal>=</literal> operator.
     </para>
     <para>
!      Testing has shown PostgreSQL's hash indexes to be similar or slower
!      than btree indexes, and the index size and build time for hash
!      indexes is much worse. Hash indexes also suffer poor performance
!      under high concurrency. For these reasons, hash index use is
!      discouraged.
     </para>

     <para>

pgsql-hackers by date:

Previous
From: Rod Taylor
Date:
Subject: funcapi.h:69: syntax error before `uint'
Next
From: Bruce Momjian
Date:
Subject: Re: What is wrong with hashed index usage?