> From: Hannu Krosing <hannu@tm.ee>
>
> As Berkeley DB came into being by splitting index methods out of an early
> version of Postgres, it should still have some similar structure left,
> so one possibility is to check what they are doing to not be that bad.
>
> Have you tried to index your dataset into a Berkeley DB database?
Yes, it works fine with Berkeley DB. I looked at both code bases and
was stupefied by their complexity. Even if there is a similar
structure, it must be very well disguised. Some of the data structures
resemble their counterparts, but the only piece that is exactly the
same is one of Berkeley DB's five hash functions.
The only useful experiment that I feel I am capable of making is
trying their __ham_hash5() function, which they claim is generally
better than the other four for most purposes. But they warn in their
comments that there is no such thing as "a hash function" -- there
must be one for each purpose.
So another experiment I might try is writing an adapter for a
user-supplied hash -- that might help in figuring out the role of the
hash function in bin overflows. That should be easy enough to do, but
fixing or re-writing the access method itself -- I'm sorry: the level
of complexity scares me. It looks like a couple of man-months
(those Mythical Man-Months :).
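
To make the adapter idea concrete, here is a rough standalone sketch of
what I have in mind. It is not code from Postgres or Berkeley DB: the
user_hash_fn signature, count_overflows() and the two sample hashes (a
deliberately weak byte sum, and 32-bit FNV-1a as a stand-in for a
"good" hash) are all my own assumptions for illustration. It only
counts how many keys land in a bin that is already full, which is the
quantity I would want to compare between hash functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical signature for a user-supplied hash function.
 * This is an assumption for the sketch, not an existing API. */
typedef uint32_t (*user_hash_fn)(const void *key, size_t len);

/* Deliberately weak hash: the sum of the key's bytes.
 * Any two keys that are permutations of each other collide. */
uint32_t hash_bytesum(const void *key, size_t len)
{
    const unsigned char *p = key;
    uint32_t h = 0;

    while (len--)
        h += *p++;
    return h;
}

/* 32-bit FNV-1a, used here only as a stand-in for a better hash. */
uint32_t hash_fnv1a(const void *key, size_t len)
{
    const unsigned char *p = key;
    uint32_t h = 2166136261u;       /* FNV offset basis */

    while (len--) {
        h ^= *p++;
        h *= 16777619u;             /* FNV prime */
    }
    return h;
}

/* Hash nkeys NUL-terminated keys into nbins bins through the supplied
 * hash function and count the keys that arrive at a bin which already
 * holds bin_capacity entries. Returns (size_t)-1 if allocation fails. */
size_t count_overflows(user_hash_fn hash, const char *const *keys,
                       size_t nkeys, size_t nbins, size_t bin_capacity)
{
    size_t *fill;
    size_t i, overflows = 0;

    assert(hash != NULL && nbins > 0);

    fill = calloc(nbins, sizeof *fill);
    if (fill == NULL)
        return (size_t)-1;

    for (i = 0; i < nkeys; i++) {
        size_t bin = hash(keys[i], strlen(keys[i])) % nbins;

        if (++fill[bin] > bin_capacity)
            overflows++;
    }

    free(fill);
    return overflows;
}
```

Feeding it the six permutations of "abc" with 8 bins of capacity 1
makes hash_bytesum put every key in the same bin, so five of the six
overflow. Running the real dataset through the same counter with
different hash functions should show how much of the overflow problem
is due to the hash alone.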
--Gene