Re: [GENERAL] "Hash index" vs. "b-tree index" (PostgreSQL - Mailing list pgsql-performance

From Jim C. Nasby
Subject Re: [GENERAL] "Hash index" vs. "b-tree index" (PostgreSQL
Date
Msg-id 20050510172652.GF31103@decibel.org
Whole thread Raw
In response to Re: [GENERAL] "Hash index" vs. "b-tree index" (PostgreSQL  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [GENERAL] "Hash index" vs. "b-tree index" (PostgreSQL  (Mischa Sandberg <mischa.sandberg@telus.net>)
List pgsql-performance
On Tue, May 10, 2005 at 11:49:50AM -0400, Tom Lane wrote:
> "Jim C. Nasby" <decibel@decibel.org> writes:
> > What's the challange to making it adaptive, comming up with an algorithm
> > that gives you the optimal bucket size (which I would think there's
> > research on...) or allowing the index to accommodate different bucket
> > sizes existing in the index at once? (Presumably you don't want to
> > re-write the entire index every time it looks like a different bucket
> > size would help.)
>
> Exactly.  That's (a) expensive and (b) really hard to fit into the WAL
> paradigm --- I think we could only handle it as a REINDEX.  So if it
> were adaptive at all I think we'd have to support multiple bucket sizes
> existing simultaneously in the index, and I do not see a good way to do
> that.

I'm not really familiar enough with hash indexes to know if this would
work, but if the maximum bucket size was known you could use that to
determine a maximum range of buckets to look at. In some cases, that
range would include only one bucket, otherwise it would be a set of
buckets. If you found a set of buckets, I think you could then just go
to the specific one you need.

If we assume that the maximum bucket size is one page it becomes more
realistic to take an existing large bucket and split it into several
smaller ones. This could be done on an update to the index page, or a
background process could handle it.

In any case, should this go on the TODO list?

> Allowing a bucket size to be specified at CREATE INDEX doesn't seem out
> of line though.  We'd have to think up a scheme for index-AM-specific
> index parameters ...
--
Jim C. Nasby, Database Consultant               decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: [GENERAL] "Hash index" vs. "b-tree index" (PostgreSQL
Next
From: PFC
Date:
Subject: Re: Partitioning / Clustering