Re: creating hash indexes - Mailing list pgsql-performance

From Peter Geoghegan
Subject Re: creating hash indexes
Date
Msg-id CAH2-WzkGNid0VNCL7Sq+=F6fnzWzvDeuvQLi1MKpJfgpzwoe8Q@mail.gmail.com
Whole thread Raw
In response to creating hash indexes  (Rick Otten <rottenwindfish@gmail.com>)
List pgsql-performance
On Wed, Dec 14, 2022 at 12:03 PM Rick Otten <rottenwindfish@gmail.com> wrote:
> Assuming I can live with the slower inserts, is there any parameter in particular I can tweak that would make the
timeit takes to create the hash index closer to the btree index creation time?  In particular if I wanted to try this
ona several billion row table in a busy database? 

No. B-Tree index builds are parallelized, and are far better optimized
in general.

> -  As long as the index fits in memory, varchar btree isn't really that much slower in postgresql 14 (the way it was
afew years ago), so we'll probably just live with that for the forseeable future given the complexity of changing
thingsat the moment. 

The other things to consider are 1.) the index size after retail
inserts, 2.) the index size following some number of updates and
deletes.

Even if you just had plain inserts for your production workload, the
picture will not match your test case (which I gather just looked at
the index size after a CREATE INDEX ran). I think that B-Tree indexes
will still come out ahead if you take this growth into account, and by
quite a bit, but probably not due to any effect that your existing test case
exercises.

B-Tree indexes are good at accommodating unpredictable growth, without
ever getting terrible performance on any metric of interest. So it's
not just that they tend to have better performance on average than
hash indexes (though they do); it's that they have much more
*predictable* performance characteristics as conditions change.

--
Peter Geoghegan



pgsql-performance by date:

Previous
From: Rick Otten
Date:
Subject: creating hash indexes
Next
From: Tim Jones
Date:
Subject: time sorted UUIDs