Re: allowing broader use of simplehash - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: allowing broader use of simplehash
Date	December 11, 2019 15:50:16
Msg-id	CA+TgmoZLOE_hJ+OHWmQ906xuUFCF4+tc74-W1qDtrO0mJv=-Yg@mail.gmail.com
In response to	Re: allowing broader use of simplehash (Andres Freund <andres@anarazel.de>)
Responses	Re: allowing broader use of simplehash
List	pgsql-hackers

On Tue, Dec 10, 2019 at 4:59 PM Andres Freund <andres@anarazel.de> wrote:
> 3) For lots of one-off uses of hashtables that aren't performance
>    critical, we want a *simple* API. That IMO would mean that key/value
>    end up being separately allocated pointers, and that just a
>    comparator is provided when creating the hashtable.

I think the simplicity of the API is a key point. Some things that are
bothersome about dynahash:

- It knows about memory contexts and insists on having its own.
- You can't just use a hash table in shared memory; you have to
"attach" to it first and have an object in backend-private memory.
- The usual way of getting a shared hash table is ShmemInitHash(), but
that means that the hash table has its own named chunk and that it's
in the main shared memory segment. If you want to put it inside
another chunk or put it in DSM or whatever, it doesn't work.
- It knows about LWLocks and if it's a shared table it needs its own
tranche of them.
- hash_search() is hard to wrap your head around.
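
As a rough illustration of the first three complaints, here is a minimal
sketch of a table that lives entirely inside a buffer the caller hands
it, so it could in principle sit inside any larger chunk. Everything in
it is hypothetical (the fixedtab_* names, the slot layout, the FNV-1a
hash); it is not dynahash or simplehash code, it never grows, it has no
delete, and it leaves locking entirely to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header stored at the start of the caller's buffer.
 * It holds no pointers, only sizes, so the buffer can be mapped anywhere. */
typedef struct FixedTab
{
	uint32_t	nslots;		/* number of slots, fixed at init time */
	uint32_t	nused;		/* slots currently occupied */
	uint32_t	keysize;	/* key = first keysize bytes of each entry */
	uint32_t	entrysize;	/* size of one caller-defined entry */
	/* slots follow: 8 status bytes, then the entry padded to 8 bytes */
} FixedTab;

static size_t
fixedtab_stride(uint32_t entrysize)
{
	return 8 + (((size_t) entrysize + 7) & ~(size_t) 7);
}

/* Bytes the caller must reserve for a table of nslots entries. */
size_t
fixedtab_size(uint32_t nslots, uint32_t entrysize)
{
	return sizeof(FixedTab) + (size_t) nslots * fixedtab_stride(entrysize);
}

/* Initialize a table in caller-provided, suitably aligned memory. */
FixedTab *
fixedtab_init(void *mem, uint32_t nslots, uint32_t keysize, uint32_t entrysize)
{
	FixedTab   *t = mem;

	assert(nslots > 0 && keysize > 0 && keysize <= entrysize);
	memset(mem, 0, fixedtab_size(nslots, entrysize));
	t->nslots = nslots;
	t->keysize = keysize;
	t->entrysize = entrysize;
	return t;
}

/* FNV-1a over the key bytes; a stand-in for a real hash function. */
static uint32_t
hash_bytes(const unsigned char *k, uint32_t len)
{
	uint32_t	h = 2166136261u;

	for (uint32_t i = 0; i < len; i++)
	{
		h ^= k[i];
		h *= 16777619u;
	}
	return h;
}

/*
 * Find the entry for key using linear probing.  If it is absent and
 * insert is nonzero, claim a free slot and copy the key into it.
 * Returns NULL if the key is absent (insert == 0) or the table is full.
 */
void *
fixedtab_lookup(FixedTab *t, const void *key, int insert)
{
	unsigned char *slots = (unsigned char *) t + sizeof(FixedTab);
	size_t		stride = fixedtab_stride(t->entrysize);
	uint32_t	pos = hash_bytes(key, t->keysize) % t->nslots;

	for (uint32_t i = 0; i < t->nslots; i++)
	{
		unsigned char *slot = slots + (size_t) ((pos + i) % t->nslots) * stride;

		if (slot[0] == 0)		/* empty slot ends the probe sequence */
		{
			if (!insert)
				return NULL;
			slot[0] = 1;
			memcpy(slot + 8, key, t->keysize);
			t->nused++;
			return slot + 8;
		}
		if (memcmp(slot + 8, key, t->keysize) == 0)
			return slot + 8;
	}
	return NULL;				/* every slot is occupied by other keys */
}
```

The caller asks fixedtab_size() how much space to reserve, carves that
out of whatever chunk it likes, and calls fixedtab_init() once; there is
no separate attach step because the header contains no addresses.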

One thing I dislike about simplehash is that the #define-based
interface is somewhat hard to use. It's not that it's a bad design.
It's just that you have to sit down and think for a while to figure out
which things you need to #define in order to get it to do what you
want. I'm not sure that's something that can or needs to be fixed, but
it's something to consider. Even dynahash, as annoying as it is, is in
some ways easier to get up and running.

Probably the two most common use cases are: (1) a fixed-size shared
memory hash table of fixed-size entries where the key is the first N
bytes of the entry and it never grows, or (2) a backend-private or
perhaps frontend hash table of fixed-size entries where the key is the
first N bytes of the entry, and it grows without limit. I think we should
consider having specialized APIs for those two cases and then more
general APIs that you can use when that's not enough.
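
To make case (2) concrete, here is a minimal sketch of what such a
specialized API could look like: a private table of fixed-size entries,
keyed by the first N bytes, that doubles when it gets about 75% full.
The growtab_* names, the use of malloc/calloc instead of memory
contexts, the FNV-1a hash, and the growth policy are all assumptions
made for illustration, not an existing PostgreSQL interface.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical private table: the caller supplies only two sizes. */
typedef struct GrowTab
{
	size_t		nslots;		/* always a power of two */
	size_t		nused;		/* occupied slots */
	size_t		keysize;	/* key = first keysize bytes of an entry */
	size_t		entrysize;
	unsigned char *used;	/* one occupancy flag per slot */
	unsigned char *entries;	/* nslots * entrysize bytes */
} GrowTab;

/* FNV-1a over the key bytes; a stand-in for a real hash function. */
static uint32_t
hash_bytes(const unsigned char *k, size_t len)
{
	uint32_t	h = 2166136261u;

	for (size_t i = 0; i < len; i++)
	{
		h ^= k[i];
		h *= 16777619u;
	}
	return h;
}

GrowTab *
growtab_create(size_t keysize, size_t entrysize)
{
	GrowTab    *t = malloc(sizeof(GrowTab));

	assert(t != NULL && keysize > 0 && keysize <= entrysize);
	t->nslots = 8;
	t->nused = 0;
	t->keysize = keysize;
	t->entrysize = entrysize;
	t->used = calloc(t->nslots, 1);
	t->entries = calloc(t->nslots, entrysize);
	assert(t->used != NULL && t->entries != NULL);
	return t;
}

/* Return the slot holding key, or the empty slot where it would go. */
static size_t
growtab_probe(const GrowTab *t, const void *key)
{
	size_t		pos = hash_bytes(key, t->keysize) & (t->nslots - 1);

	while (t->used[pos] &&
		   memcmp(t->entries + pos * t->entrysize, key, t->keysize) != 0)
		pos = (pos + 1) & (t->nslots - 1);
	return pos;
}

/* Double the slot array and reinsert every existing entry. */
static void
growtab_grow(GrowTab *t)
{
	GrowTab		old = *t;

	t->nslots = old.nslots * 2;
	t->used = calloc(t->nslots, 1);
	t->entries = calloc(t->nslots, t->entrysize);
	assert(t->used != NULL && t->entries != NULL);
	for (size_t i = 0; i < old.nslots; i++)
	{
		const unsigned char *src = old.entries + i * old.entrysize;

		if (!old.used[i])
			continue;
		size_t		pos = growtab_probe(t, src);

		t->used[pos] = 1;
		memcpy(t->entries + pos * t->entrysize, src, t->entrysize);
	}
	free(old.used);
	free(old.entries);
}

/* Return the entry for key, or NULL if it is not present. */
void *
growtab_lookup(GrowTab *t, const void *key)
{
	size_t		pos = growtab_probe(t, key);

	return t->used[pos] ? t->entries + pos * t->entrysize : NULL;
}

/* Return the entry for key, creating a zeroed one (key filled in) if needed. */
void *
growtab_insert(GrowTab *t, const void *key, int *found)
{
	size_t		pos = growtab_probe(t, key);

	*found = t->used[pos];
	if (!*found)
	{
		if ((t->nused + 1) * 4 > t->nslots * 3)	/* keep load <= 75% */
		{
			growtab_grow(t);
			pos = growtab_probe(t, key);
		}
		t->used[pos] = 1;
		t->nused++;
		memcpy(t->entries + pos * t->entrysize, key, t->keysize);
	}
	return t->entries + pos * t->entrysize;
}
```

One cost of this particular design is that entry pointers are only
valid until the next insert that triggers growth; an API meant for real
use would have to either document that or allocate entries separately.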

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company