Re: Deprecating Hash Indexes - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: Deprecating Hash Indexes
Date
Msg-id 507ACE0D.1060805@dunslane.net
Whole thread Raw
In response to Deprecating Hash Indexes  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On 10/14/2012 09:45 AM, Simon Riggs wrote:
> As discussed on other threads, Hash Indexes are currently a broken
> feature and bring us into disrepute.
>
> I describe them as broken because they are neither crash safe, nor
> could they properly be called unlogged indexes (or if so, why just
> hash?). And also because if somebody suggested implementing them the
> way they are now, they would be told they were insane because silent
> data corruption is not something we tolerate anymore. We know why they
> are like that, but its time to remove the rest of the historical
> research legacy. It's hard to say "We respect your data [except if you
> press here]" and be taken seriously.
>
> Suggested actions are
>
> * Put WARNINGs in the docs against the use of hash indexes, backpatch
> to 8.3. CREATE INDEX gives no warning currently, though Index Types
> does mention a caution.
>
> * Mention in the current docs that hash indexes are likely to be
> deprecated completely in future releases. Should anybody ever make
> them work, we can change that advice quickly but I don't think we're
> going to.
>
> Personally, I would like to see them removed into a contrib module to
> allow people to re-add them if they understand the risks. ISTM better
> to confiscate all foot-guns before they go off and then re-issue them
> to marksmen, at the risk of annoying the people that use them with
> full knowledge but that's likely a contentious issue.
>
> Alternate views?


I think we'd be better off to start by implementing unlogged indexes 
generally. I have a client who is using hash indexes for the performance 
gain on a very heavy insert load precisely because they are unlogged, 
and who can get away with it because all their index lookup requirements 
are equality. They are aware of the consequences of a crash, and can 
manage the risk accordingly.

But there is a lot of attraction in the idea of unlogged btree indexes 
for example. And the danger of an unlogged index is substantially less 
than that of an unlogged table. After all, your data is still there, 
quite crash-safe, and you can thus just rebuild the index after a crash.

cheers

andrew



pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: pg_stat_lwlocks view - lwlocks statistics, round 2
Next
From: Gilles Darold
Date:
Subject: pg_dump restore error