allowing broader use of simplehash - Mailing list pgsql-hackers

From Robert Haas
Subject allowing broader use of simplehash
Date
Msg-id CA+Tgmob8oyh02NrZW=xCScB+5GyJ-jVowE3+TWTUmPF=FsGWTA@mail.gmail.com
Responses Re: allowing broader use of simplehash  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
I recently became annoyed while working on patch A that I could not
use simplehash in shared memory, and then I became annoyed again while
working on patch B that I could not use simplehash in frontend code.
So here are a few patches for discussion.

A significant problem in either case is that a simplehash wants to
live in a memory context; no such thing exists either for data in
shared memory or in frontend code. However, it seems to be quite easy
to provide a way for simplehash to be defined so that it doesn't care
about memory contexts. See 0001.
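
As a rough illustration of the idea (not the contents of 0001), a
table can route every allocation through a macro that the includer
defines, so nothing in it refers to a memory context. All of the
demo_* and DEMO_* names below are hypothetical, and calloc() stands in
for whatever allocator the caller actually wants:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical knob, not taken from the patch: the table allocates
 * through whatever DEMO_RAW_ALLOCATOR names and never needs a memory
 * context.
 */
#define DEMO_RAW_ALLOCATOR demo_alloc_zeroed

typedef struct demo_entry
{
	unsigned	key;
	int			used;
} demo_entry;

typedef struct demo_table
{
	size_t		size;		/* number of buckets */
	demo_entry *data;		/* bucket array, zero-initialized */
} demo_table;

/* Plain zeroing allocator, usable in frontend code; exits on failure. */
static void *
demo_alloc_zeroed(size_t size)
{
	void	   *p = calloc(1, size);

	if (p == NULL)
	{
		fprintf(stderr, "out of memory\n");
		exit(1);
	}
	return p;
}

/* Create a table using only the raw allocator. */
static demo_table *
demo_create(size_t nbuckets)
{
	demo_table *tb;

	assert(nbuckets > 0);
	tb = DEMO_RAW_ALLOCATOR(sizeof(demo_table));
	tb->size = nbuckets;
	tb->data = DEMO_RAW_ALLOCATOR(sizeof(demo_entry) * nbuckets);
	return tb;
}

/* Release a table created by demo_create(). */
static void
demo_destroy(demo_table *tb)
{
	free(tb->data);
	free(tb);
}
```

A caller would do something like demo_create(8), use the table, and
then call demo_destroy(). Pointing the macro at a different function
changes where the memory comes from without touching the table code.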

As far as frontend code goes, the only other problem I found is that
simplehash.h uses elog() to signal some internal-ish messages. It seemed
to me that the easiest thing to do was, if FRONTEND is defined, use
pg_log_error(...) instead of elog(ERROR, ...). For the one instance of
elog(LOG, ...) in simplehash.h, I chose to use pg_log_info(). It's not
really equivalent, but it's probably the closest thing that currently
exists, and I think it's good enough for what's basically a debugging
message. See 0002.
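
The kind of compile-time switch described above might look roughly
like the following. This is only a sketch under assumptions:
DEMO_FRONTEND, demo_sh_error, demo_report_error and demo_check_size
are made-up names, and the frontend branch uses a local stand-in
function so that the example compiles outside the PostgreSQL tree.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical switch playing the role of FRONTEND in this sketch. */
#define DEMO_FRONTEND 1

static int	demo_error_count = 0;

#ifdef DEMO_FRONTEND
/* Stand-in for a frontend call like pg_log_error(): print and return. */
static void
demo_report_error(const char *fmt, ...)
{
	va_list		ap;

	demo_error_count++;
	fputs("error: ", stderr);
	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	fputc('\n', stderr);
}
#define demo_sh_error(...) demo_report_error(__VA_ARGS__)
#else
/* In backend code the same macro would expand to elog(ERROR, ...). */
#define demo_sh_error(...) elog(ERROR, __VA_ARGS__)
#endif

/* Returns 1 if the requested size is acceptable, 0 after reporting. */
static int
demo_check_size(size_t requested, size_t max_size)
{
	if (requested > max_size)
	{
		demo_sh_error("hash table too large: %zu", requested);
		return 0;
	}
	return 1;
}
```

One difference worth keeping in mind: elog(ERROR, ...) does not return
to its caller, while a function that only prints a message does, so
the calling code in this sketch checks the return value itself.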

I think those changes would also be enough to allow simplehash to be
used in a dynamic shared area (DSA). Using it in the main shared
memory segment seems more problematic, because simplehash relies on
being able to resize the hash table. Shared hash tables must have a
fixed maximum size, but with dynahash, we can count on being able to
use all of the entries without significant performance degradation.
simplehash, on the other hand, uses linear probing and relies on being
able to grow the hash table as a way of escaping collisions. By
default, that growth is allowed as long as the load factor does not
drop below 0.1, so to
mimic the collision-avoidance behavior that we get in backend-private
uses of simplehash, we'd have to overallocate by 10x, which doesn't
seem desirable.
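
To put numbers on the 10x figure, here is a small sketch of the
arithmetic, assuming a fixed-size table has to hold max_entries while
never exceeding a chosen load factor. demo_slots_needed is a
hypothetical helper, and the load factor is passed as an integer
percentage to keep the math exact:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of slots to preallocate so that max_entries entries never push
 * the load factor above load_percent (1..100). Rounds up.
 */
static uint64_t
demo_slots_needed(uint64_t max_entries, unsigned load_percent)
{
	assert(load_percent > 0 && load_percent <= 100);
	return (max_entries * 100 + load_percent - 1) / load_percent;
}
```

For example, demo_slots_needed(1000, 10) is 10000, ten times the
number of entries, whereas demo_slots_needed(1000, 100) is just 1000.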

I'd really like to have an alternative to dynahash, which is awkward
to use and probably not particularly fast, but I'm not sure simplehash
is it.  Maybe what we really need is a third (or nth) hash table
implementation.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
