Re: Hash Indexes - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: Hash Indexes
Date
Msg-id CAMkU=1wYxVhEfk9Sg632Yk_oj4zyydeCVqB=d3YbQ-nH5xXzNg@mail.gmail.com
In response to Re: Hash Indexes  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Hash Indexes  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Thu, Sep 1, 2016 at 8:55 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

> I have fixed all other issues you have raised.  Updated patch is
> attached with this mail.

I am finding the comments (particularly the README) quite hard to follow.  There are many references to an "overflow bucket" or similar phrases.  I think these should be "overflow pages".  A bucket is a conceptual thing consisting of a primary page for that bucket and zero or more overflow pages for the same bucket.  There are no overflow buckets, unless you are referring to the new bucket to which things are being moved.
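To make the terminology concrete, here is a rough sketch of the bucket/page relationship described above. It is not the actual on-disk layout or any code from the patch; the names PageSketch, next_overflow, bucket_overflow_page_count and bucket_tuple_count are made up for illustration, and an in-memory pointer stands in for whatever block-number link the real pages use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory stand-in for one hash index page. */
typedef struct PageSketch
{
    struct PageSketch *next_overflow;   /* next overflow page, NULL if none */
    int         ntuples;                /* tuples stored on this page */
} PageSketch;

/*
 * A "bucket" is just its primary page plus the chain of overflow pages
 * hanging off it.  Count only the overflow pages (zero or more).
 */
static size_t
bucket_overflow_page_count(const PageSketch *primary)
{
    size_t      n = 0;

    assert(primary != NULL);
    for (const PageSketch *p = primary->next_overflow; p != NULL; p = p->next_overflow)
        n++;
    return n;
}

/* Total tuples in the bucket: primary page plus every overflow page. */
static long
bucket_tuple_count(const PageSketch *primary)
{
    long        total = 0;

    assert(primary != NULL);
    for (const PageSketch *p = primary; p != NULL; p = p->next_overflow)
        total += p->ntuples;
    return total;
}
```

In this picture there is exactly one bucket per primary page, and "overflow" only ever describes pages within that bucket.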

Was maintaining on-disk compatibility a major concern for this patch?  Would you do things differently if that were not a concern?  If we would benefit from a break in format, I think it would be better to do that now while hash indexes are still discouraged, rather than in a future release.

In particular, I am thinking about the need for every insert to exclusive-content-lock the meta page to increment the index-wide tuple count.  I think that this is going to be a huge bottleneck on update-intensive workloads (which I don't believe have been performance tested yet).  I was wondering if we might want to change that so that each bucket keeps a local count, and sweeps that up to the meta page only when it exceeds a threshold.  But this would require the bucket page to have an area to hold such a count.  Another idea would be to keep not a count of tuples, but a count of buckets with at least one overflow page, and split when there are too many of those.  I bring it up now because it would be a shame to ignore it until 10.0 is out the door, and then need to break things in 11.0.
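A minimal sketch of the per-bucket counting idea, to show the intended effect on meta page locking. Everything here is an assumption for illustration: MetaPageSketch, BucketSketch, SWEEP_THRESHOLD and bucket_note_insert are hypothetical names, the threshold value is arbitrary, and the real locking is reduced to a counter of how many times the meta page would have to be exclusively locked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical threshold; a real value would need benchmarking. */
#define SWEEP_THRESHOLD 64

/* Stand-in for the meta page: index-wide count plus a lock counter. */
typedef struct MetaPageSketch
{
    uint64_t    ntuples;            /* index-wide tuple count */
    unsigned    lock_acquisitions;  /* times the meta page was "locked" */
} MetaPageSketch;

/* Stand-in for the extra area a bucket page would need. */
typedef struct BucketSketch
{
    uint32_t    local_delta;        /* inserts not yet swept to the meta page */
} BucketSketch;

/*
 * Record one insert into a bucket.  The meta page is touched only when
 * the bucket's local count reaches the threshold, not on every insert.
 */
static void
bucket_note_insert(BucketSketch *bucket, MetaPageSketch *meta)
{
    assert(bucket != NULL && meta != NULL);

    bucket->local_delta++;
    if (bucket->local_delta >= SWEEP_THRESHOLD)
    {
        meta->lock_acquisitions++;      /* exclusive lock would be taken here */
        meta->ntuples += bucket->local_delta;
        bucket->local_delta = 0;
    }
}
```

With these made-up numbers, 130 inserts into one bucket touch the meta page twice rather than 130 times, at the cost of the index-wide count lagging by up to SWEEP_THRESHOLD - 1 tuples per bucket. Whether that lag is acceptable for the split decision is exactly the kind of thing that would need discussion.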

Cheers,

Jeff
