Re: Hash Indexes - Mailing list pgsql-hackers

From: Tom Lane
Subject: Re: Hash Indexes
Msg-id: 5085.1475515720@sss.pgh.pa.us
In response to: Re: Hash Indexes (Jeff Janes <jeff.janes@gmail.com>)
List: pgsql-hackers

Jeff Janes <jeff.janes@gmail.com> writes:
> I've done a simple comparison using pgbench's default transaction, in which
> all the primary keys have been dropped and replaced with indexes of either
> hash or btree type, alternating over many rounds.

> I run 'pgbench -c16 -j16 -T 900 -M prepared' on an 8 core machine with a
> scale of 40.  All the data fits in RAM, but not in shared_buffers (128MB).

> I find a 4% improvement for hash indexes over btree indexes, 9324.744
> vs 9727.766.  The difference is significant at p-value of 1.9e-9.
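
For reference, the quoted 4% figure can be checked from the two throughput numbers. This is only a sketch: it assumes both values are pgbench transactions per second and that the larger one is the hash result, since the message does not label them. The names btree_tps, hash_tps and relative_improvement are illustrative.

```python
# Assumption: both figures are pgbench TPS, and the larger value is the
# hash-index result (the message says hash improves on btree but does not
# label the two numbers).
btree_tps = 9324.744
hash_tps = 9727.766


def relative_improvement(baseline: float, candidate: float) -> float:
    """Return the fractional gain of candidate over baseline."""
    return (candidate - baseline) / baseline


gain = relative_improvement(btree_tps, hash_tps)
print(f"hash vs btree: {gain:.1%}")
```

Under those assumptions the gain comes out a little above 4%, consistent with the rounded figure in the quoted text.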

Thanks for doing this work!

> The four versions of hash indexes (HEAD, concurrent, wal, cache, applied
> cumulatively) have no statistically significant difference in performance
> from each other.

Interesting.

> I think I don't see improvement in hash performance with the concurrent and
> cache patches because I don't have enough cores to get to the contention
> that those patches are targeted at.

Possibly.  However, if the cache patch is not a prerequisite to the WAL
fixes, IMO somebody would have to demonstrate that it has a measurable
performance benefit before it would get in.  It certainly doesn't look
like it's simplifying the code, so I wouldn't take it otherwise.

I think, though, that this is enough to put to bed the argument that
we should toss the hash AM entirely.  If it's already competitive with
btree today, despite the lack of attention that it's gotten, there is
reason to hope that it will be a significant win (for some use-cases,
obviously) in future.  We should now get back to reviewing these patches
on their own merits.
        regards, tom lane


