Re: [PERFORM] BUG #2737: hash indexing large tablefails,while btree of same index works - Mailing list pgsql-bugs

From Kenneth Marshall
Subject Re: [PERFORM] BUG #2737: hash indexing large tablefails,while btree of same index works
Date
Msg-id 20061117142657.GQ12572@it.is.rice.edu
Whole thread Raw
In response to Re: [PERFORM] BUG #2737: hash indexing large tablefails,while btree of same index works  ("Simon Riggs" <simon@2ndquadrant.com>)
List pgsql-bugs
On Fri, Nov 17, 2006 at 10:59:11AM +0000, Simon Riggs wrote:
> On Thu, 2006-11-16 at 17:48 -0500, Tom Lane wrote:
>
> > AFAICS, any hash index exceeding a single segment is at serious risk.
> > The fact that we've not heard gripes before suggests that no one is
> > using gigabyte-sized hash indexes.
>
> Seems so.
>
> > But it seems mighty late in the beta cycle to be making subtle changes
> > in the smgr API.  What I'm inclined to do for now is to hack
> > _hash_expandtable() to write a page of zeroes at the end of each file
> > segment when an increment in hashm_ovflpoint causes the logical EOF to
> > cross segment boundary(s).  This is pretty ugly and nonmodular, but it
> > will fix the bug without risking breakage of any non-hash code.
> > I'll revisit the cleaner solution once 8.3 devel begins.  Comments?
>
> Do we think there is hope of improving hash indexes? If not, I'm
> inclined to remove them rather than to spend time bolstering them. We
> can remap the keyword as was done with RTREE. It's somewhat embarrassing
> having an index without clear benefit that can't cope across crashes. We
> wouldn't accept that in other parts of the software...
>

While I understand that there are currently serious problems in terms of
recovery with the hash index as it stands currently, it is theoretically
possible to get a result back with a single I/O probe from a hash and
that is not the case with btree indexes. At the top end of the performance
curve, every little bit that can be done to minimize actual I/Os is needed.
I certainly hold out some hope that they can improved. I would like to see
them still included. Once they are gone, it will be much harder to ever
add them back.

Ken Marshall

pgsql-bugs by date:

Previous
From: manisha balwadkar
Date:
Subject: Stored Procedure with more than one OUT Parameter
Next
From: "Mikko Tiihonen"
Date:
Subject: BUG #2768: dates before year 1600 in timestamptz column give strange results