Home > mailing lists

Re: RFC: Improve CPU cache locality of syscache searches - Mailing list pgsql-hackers

From	John Naylor
Subject	Re: RFC: Improve CPU cache locality of syscache searches
Date	August 6, 2021 14:48:30
Msg-id	CAFBsxsEtCJkO8vOjfBzfHdsG0TDGjt5CGRE3RG_1+0WOpyw3sg@mail.gmail.com Whole thread Raw
In response to	Re: RFC: Improve CPU cache locality of syscache searches (Andres Freund <andres@anarazel.de>)
Responses	Re: RFC: Improve CPU cache locality of syscache searches
List	pgsql-hackers

Tree view

On Thu, Aug 5, 2021 at 4:12 PM Andres Freund <andres@anarazel.de> wrote:
> I have wondered before whether we should have syscache definitions generate
> code specific to each syscache definition. I do think that'd give a good bit
> of performance boost. But I don't see a trivial way to get there without
> notational overhead.
>
> We could define syscaches in a header using a macro that's defined differently
> in syscache.c than everywhere else. The header would declare a set of
> functions for each syscache, syscache.c would define them to call an
> always_inline function with the relevant constants.
>
> Or perhaps we should move syscache definitions into the pg_*.h headers, and
> generate the relevant code as part of their processing. That seems like it
> could be nice from a modularity POV alone. And it could do better than the
> current approach, because we could hardcode the types for columns in the
> syscache definition without increasing notational overhead.

I had code laying around to generate the info needed for initialization, but I apparently deleted it and never emailed it. :-(

If we went that far, I wonder if we could specialize the cc_bucket for the actual types, rather than just number of Datums, which would further save on space. More complex, though.

Also, if comparisons got cheaper, maybe we could get away with not storing the full hash in the bucket in favor of a tag of the 16 highest bits. (The catctup would still need the full hash for invalidation, at least).

Another idea I came across while researching in-memory key-value stores is "bulk chaining" -- see [1] for an example. In that case, our example 2-cacheline bucket would have a dlist_head pointing not to the catctups, but to another bucket. So that throws away my L1/L2 analogy and uses a chain of buckets, each of which contain multiple sets of hashes/keys/pointers. That's closer to a normal collision chain, but with better cache locality. If we did that, I imagine the catctup could dispense with storing its Datum array and its dlist_node entirely.

[1] https://www.usenix.org/system/files/conference/nsdi14/nsdi14-paper-lim.pdf

--
John Naylor
EDB: http://www.enterprisedb.com

pgsql-hackers by date:

From: "tanghy.fnst@fujitsu.com"
Date: 06 August 2021, 14:14:21
Subject: [PATCH]Remove obsolete macro CHECKFLOATVAL in btree_gist

From: Tom Lane
Date: 06 August 2021, 14:48:40
Subject: Re: Alias collision in `refresh materialized view concurrently`

Re: RFC: Improve CPU cache locality of syscache searches - Mailing list pgsql-hackers

Previous

Next