Re: Use simplehash.h instead of dynahash in SMgr - Mailing list pgsql-hackers

From David Rowley
Subject Re: Use simplehash.h instead of dynahash in SMgr
Date
Msg-id CAApHDvrE=vB6jtT-HEVHm_kPtKTFFW7Hu=fWGHNidq9Ah3Tv9Q@mail.gmail.com
In response to Re: Use simplehash.h instead of dynahash in SMgr  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Tue, 22 Jun 2021 at 02:53, Robert Haas <robertmhaas@gmail.com> wrote:
> At the risk of kibitzing the least-important detail of this proposal,
> I'm not very happy with the names of our hash implementations.
> simplehash is not especially simple, and dynahash is not particularly
> dynamic, especially now that the main place we use it is for
> shared-memory hash tables that can't be resized. Likewise, generichash
> doesn't really give any kind of clue about how this hash table is
> different from any of the others. I don't know how possible it is to
> do better here; naming things is one of the two hard problems in
> computer science. In a perfect world, though, our hash table
> implementations would be named in such a way that somebody might be
> able to look at the names and guess on that basis which one is
> best-suited to a given task.

I'm certainly open to better names.  I did almost call it stablehash,
in reference to the pointers to elements not moving around like they
do with simplehash.

I think, more generally, hash table implementations are complex enough
that it's pretty much impossible to give them a name that is both
short and meaningful.  Most papers just end up assigning a name to
some technique, e.g. Robin Hood, Cuckoo, etc.

Both simplehash and generichash use a variant of Robin Hood hashing.
simplehash uses open addressing and generichash does not.  If Andres
had called it "robinhoodhash" instead of simplehash, then someone
might come along and complain that his implementation is broken
because it does not implement tombstoning.  Maybe Andres thought he'd
avoid that by not claiming that it's an implementation of a Robin Hood
hash table.  That seems pretty wise to me.  Naming it simplehash was a
pretty simple way of avoiding that problem.
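
For anyone not familiar with the technique, here is a rough sketch of
Robin Hood insertion on top of open addressing.  To be clear, this is
not code from simplehash.h or from the patch: the names (RhTable,
RhEntry, rh_insert, rh_lookup), the fixed 8-slot capacity and the
identity hash are all made up for illustration, and deletion (which is
where the tombstone question comes in) is left out entirely.  It only
aims to show why an entry can move to another slot when something else
is inserted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RH_CAPACITY 8				/* hypothetical; must be a power of two */
#define RH_MASK (RH_CAPACITY - 1)

typedef struct RhEntry
{
	bool		used;
	uint32_t	key;
	int			value;
} RhEntry;

typedef struct RhTable
{
	RhEntry		slots[RH_CAPACITY];	/* entries live directly in the array */
	size_t		nused;
} RhTable;

/* Toy hash function (identity), chosen so collisions are easy to stage. */
static uint32_t
rh_hash(uint32_t key)
{
	return key;
}

/* How far the key stored in 'slot' sits from its ideal bucket. */
static size_t
rh_distance(uint32_t key, size_t slot)
{
	size_t		ideal = rh_hash(key) & RH_MASK;

	return (slot + RH_CAPACITY - ideal) & RH_MASK;
}

static RhEntry *
rh_lookup(RhTable *t, uint32_t key)
{
	size_t		slot = rh_hash(key) & RH_MASK;

	for (size_t dist = 0; dist < RH_CAPACITY; dist++)
	{
		RhEntry    *e = &t->slots[slot];

		if (!e->used)
			return NULL;
		if (e->key == key)
			return e;
		/* A resident closer to home than we are means the key is absent. */
		if (rh_distance(e->key, slot) < dist)
			return NULL;
		slot = (slot + 1) & RH_MASK;
	}
	return NULL;
}

static void
rh_insert(RhTable *t, uint32_t key, int value)
{
	RhEntry		cur = {true, key, value};
	RhEntry    *existing = rh_lookup(t, key);
	size_t		slot = rh_hash(key) & RH_MASK;
	size_t		dist = 0;

	if (existing != NULL)
	{
		existing->value = value;
		return;
	}
	assert(t->nused < RH_CAPACITY - 1);	/* no resizing in this sketch */

	for (;;)
	{
		RhEntry    *e = &t->slots[slot];
		size_t		edist;

		if (!e->used)
		{
			*e = cur;
			t->nused++;
			return;
		}
		edist = rh_distance(e->key, slot);
		if (edist < dist)
		{
			/* "Steal from the rich": evict the entry that is closer to home. */
			RhEntry		tmp = *e;

			*e = cur;
			cur = tmp;
			dist = edist;
		}
		slot = (slot + 1) & RH_MASK;
		dist++;
	}
}
```

With that sketch, inserting keys 2 and 1 and then key 9 (which
collides with 1) evicts key 2 from its slot, so a pointer obtained
from rh_lookup() for key 2 before the last insert afterwards points at
the entry for key 9.  That kind of movement is what I had in mind with
the stablehash name.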

Anyway, I'm open to better names, but I don't think the name should
drive the implementation.  If the implementation does not fit the name
perfectly, then the name should change rather than the implementation.

Personally, I think we should call it RowleyHash, but I think others
might object. ;-)

David


