Re: Use simplehash.h instead of dynahash in SMgr - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Use simplehash.h instead of dynahash in SMgr
Date
Msg-id CA+hUKGJVs32f68JXnQrdTrfbpmh5-V9p_76Knf6vVzoVTUuQtg@mail.gmail.com
Whole thread Raw
In response to Re: Use simplehash.h instead of dynahash in SMgr  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Use simplehash.h instead of dynahash in SMgr
List pgsql-hackers
On Tue, Jun 22, 2021 at 6:51 PM David Rowley <dgrowleyml@gmail.com> wrote:
> I guess we could also ask ourselves how many join algorithms we need.

David and I discussed this a bit off-list, and I just wanted to share
how I understand the idea so far in case it helps someone else.  There
are essentially three subcomponents working together:

1.  A data structure similar in some ways to a C++ std::deque<T>,
which gives O(1) access to elements by index, is densely packed to
enable cache-friendly scanning of all elements, has stable addresses
(as long as you only add new elements at the end or overwrite existing
slots), and is internally backed by an array of pointers to a set of
chunks.

2.  A bitmapset that tracks unused elements in 1, making it easy to
find the lowest-index hole when looking for a place to put a new one
by linear search for a 1 bit, so that we tend towards maximum density
despite having random frees from time to time (seems good, the same
idea is used in  kernels to allocate the lowest unused file descriptor
number).

3. A hash table that has as elements indexes into 1.  It somehow hides
the difference between keys (what callers look things up with) and
keys reachable by following an index into 1 (where elements' keys
live).

One thought is that you could do 1 as a separate component as the
"primary" data structure, and use a plain old simplehash for 3 as a
kind of index into it, but use pointers (rather than indexes) to
objects in 1 as elements.  I don't know if it's better.



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Maintaining a list of pgindent commits for "git blame" to ignore
Next
From: Yugo NAGATA
Date:
Subject: Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors