On Thu, 2006-11-16 at 17:48 -0500, Tom Lane wrote:
> AFAICS, any hash index exceeding a single segment is at serious risk.
> The fact that we've not heard gripes before suggests that no one is
> using gigabyte-sized hash indexes.
Seems so.
> But it seems mighty late in the beta cycle to be making subtle changes
> in the smgr API. What I'm inclined to do for now is to hack
> _hash_expandtable() to write a page of zeroes at the end of each file
> segment when an increment in hashm_ovflpoint causes the logical EOF to
> cross a segment boundary (or several). This is pretty ugly and nonmodular, but it
> will fix the bug without risking breakage of any non-hash code.
> I'll revisit the cleaner solution once 8.3 devel begins. Comments?
Do we think there is hope of improving hash indexes? If not, I'm
inclined to remove them rather than spend time bolstering them. We
can remap the keyword as was done with RTREE. It's somewhat embarrassing
to have an index type with no clear benefit that can't survive a crash. We
wouldn't accept that in other parts of the software...
If there is hope, is there a specific place to look? It would be good to
brain-dump some starting points for an investigation.
Does anybody have a perf test that shows hash indexes beating btrees by
any significant margin? (Not saying there isn't one...)
I can see the argument that fixed hash indexes would be faster, but
there are obviously major downsides to that approach.
--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com