Re: RFC: Improve CPU cache locality of syscache searches - Mailing list pgsql-hackers

From	John Naylor
Subject	Re: RFC: Improve CPU cache locality of syscache searches
Date	August 5, 2021 16:27:49
Msg-id	CAFBsxsGkBtEVjjMLZcRQqKxUCZBauoiLBPmH3X-EDSSWd__Yug@mail.gmail.com
In response to	Re: RFC: Improve CPU cache locality of syscache searches (Andres Freund <andres@anarazel.de>)
Responses	Re: RFC: Improve CPU cache locality of syscache searches
List	pgsql-hackers

On Wed, Aug 4, 2021 at 3:44 PM Andres Freund <andres@anarazel.de> wrote:

> On 2021-08-04 12:39:29 -0400, John Naylor wrote:
> > typedef struct cc_bucket
> > {
> >     uint32 hashes[4];
> >     catctup *ct[4];
> >     dlist_head conflicts;
> > } cc_bucket;
>
> I'm not convinced that the above is the right idea though. Even if the hash
> matches, you're still going to need to fetch at least catctup->keys[0] from
> a separate cacheline to be able to return the cache entry.

I see your point. It doesn't make sense to inline only part of the information needed.

> struct cc_bucket_1
> {
>     uint32 hashes[3]; // 12
>     // 4 bytes alignment padding
>     Datum key0s[3]; // 24
>     catctup *ct[3]; // 24
>     // cacheline boundary
>     dlist_head conflicts; // 16
> };
>
> would be better for 1 key values?
>
> It's obviously annoying to need different bucket types for different key
> counts, but given how much 3 unused key Datums waste, it seems worth paying
> for?

Yeah, it's annoying, but it does make a big difference to keep out unused Datums:

keys    cachelines per bucket
        3 values    4 values

1       1 1/4       1 1/2
2       1 5/8       2
3       2           2 1/2
4       2 3/8       3
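
The figures in this table can be reproduced with a little arithmetic. The following is only a sketch under stated assumptions: 64-byte cachelines, a 4-byte hash, 8-byte Datums and pointers, a 16-byte dlist_head, and the layout from the quoted cc_bucket_1 (hashes, then padding, then keys, then entry pointers, then the list head). The function and macro names are made up for illustration and do not come from any patch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes for a typical 64-bit build (not taken from a patch). */
#define CACHELINE_BYTES   64
#define HASH_BYTES        4     /* uint32 hash value */
#define DATUM_BYTES       8     /* Datum */
#define POINTER_BYTES     8     /* catctup * */
#define DLIST_HEAD_BYTES  16    /* dlist_head: two pointers */

/* Bytes used by a bucket that inlines nvalues entries of nkeys keys each. */
static size_t
bucket_bytes(int nkeys, int nvalues)
{
    size_t      bytes;

    assert(nkeys >= 1 && nkeys <= 4 && nvalues >= 1);

    bytes = (size_t) nvalues * HASH_BYTES;
    bytes = (bytes + 7) & ~(size_t) 7;  /* pad the hashes to 8-byte alignment */
    bytes += (size_t) nkeys * nvalues * DATUM_BYTES;
    bytes += (size_t) nvalues * POINTER_BYTES;
    bytes += DLIST_HEAD_BYTES;
    return bytes;
}

/* The same figure in eighths of a cacheline, e.g. 10 means 1 1/4 cachelines. */
static size_t
bucket_cacheline_eighths(int nkeys, int nvalues)
{
    return bucket_bytes(nkeys, nvalues) / (CACHELINE_BYTES / 8);
}
```

For example, bucket_bytes(1, 3) is 80 bytes, so bucket_cacheline_eighths(1, 3) is 10, which is the 1 1/4 cachelines shown for one key and three values.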

Or, looking at it another way, limiting the bucket size to 2 cachelines, we can fit:

keys    values
1       5
2       4
3       3
4       2

Although I'm guessing inlining just two values in the 4-key case wouldn't buy much.
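
One way to double-check the two-cacheline limits is to let the compiler lay out a bucket type per key count and inspect sizeof. This is a minimal sketch, not the proposed implementation: the type names, the DEFINE_BUCKET macro, and the fixed-width stand-ins for Datum, catctup * and dlist_head are all hypothetical, and it assumes 64-byte cachelines on a 64-bit platform.

```c
#include <assert.h>
#include <stdint.h>

#define CACHELINE_BYTES 64      /* assumed cacheline size */

/* Hypothetical stand-ins so the sketch compiles without PostgreSQL headers. */
typedef uint64_t sketch_datum;          /* assumes an 8-byte Datum */
typedef uint64_t sketch_entry_ptr;      /* stands in for catctup * */
typedef struct
{
    uint64_t    prev;
    uint64_t    next;
} sketch_dlist_head;                    /* assumes two 8-byte pointers */

/* Declare a bucket type that inlines nvalues entries of nkeys keys each. */
#define DEFINE_BUCKET(name, nkeys, nvalues)     \
    typedef struct name                         \
    {                                           \
        uint32_t          hashes[nvalues];      \
        sketch_datum      keys[nkeys][nvalues]; \
        sketch_entry_ptr  ct[nvalues];          \
        sketch_dlist_head conflicts;            \
    } name

#define FITS_TWO_CACHELINES(type) (sizeof(type) <= 2 * CACHELINE_BYTES)

/* The largest value counts that fit in two cachelines ... */
DEFINE_BUCKET(bucket_1key_5, 1, 5);
DEFINE_BUCKET(bucket_2key_4, 2, 4);
DEFINE_BUCKET(bucket_3key_3, 3, 3);
DEFINE_BUCKET(bucket_4key_2, 4, 2);

/* ... and one more value per key count, which no longer fits. */
DEFINE_BUCKET(bucket_1key_6, 1, 6);
DEFINE_BUCKET(bucket_2key_5, 2, 5);
DEFINE_BUCKET(bucket_3key_4, 3, 4);
DEFINE_BUCKET(bucket_4key_3, 4, 3);

static_assert(FITS_TWO_CACHELINES(bucket_1key_5), "1 key, 5 values");
static_assert(FITS_TWO_CACHELINES(bucket_2key_4), "2 keys, 4 values");
static_assert(FITS_TWO_CACHELINES(bucket_3key_3), "3 keys, 3 values");
static_assert(FITS_TWO_CACHELINES(bucket_4key_2), "4 keys, 2 values");
```

Generating one type per key count from a macro is just one way to cope with needing different bucket types; whether that is preferable to hand-written structs is a design question the thread leaves open.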

> If we stuffed four values into one bucket we could potentially SIMD the hash
> and Datum comparisons ;)

;-) That's an interesting future direction to consider when we support building with x86-64-v2. It'd be pretty easy to compare a vector of hashes and quickly get the array index for the key comparisons (ignoring for the moment how to handle the rare case of multiple identical hashes). However, we currently don't memcmp() the Datums and instead call an "eqfast" function, so I don't see how that part would work in a vector setting.
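
To make the hash-comparison half of that idea concrete, here is a rough sketch for a single-key bucket with four slots. It compares all four stored hashes against the probe hash at once and produces a bitmask of candidate slots; the key comparison then runs per candidate through a function pointer, which also covers the case of several identical hashes. All names here (match_hashes4, bucket_find4, key_eq_fn, int_key_eq) are hypothetical, keys are modelled as plain 64-bit integers, and this particular version happens to need only SSE2 intrinsics, with a scalar fallback elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical stand-in for a per-type "eqfast" comparison function. */
typedef int (*key_eq_fn) (uint64_t a, uint64_t b);

/* Example comparison for keys that are plain integers. */
static int
int_key_eq(uint64_t a, uint64_t b)
{
    return a == b;
}

/* Return a 4-bit mask with bit i set when hashes[i] equals hash. */
static int
match_hashes4(const uint32_t hashes[4], uint32_t hash)
{
#if defined(__SSE2__)
    __m128i     stored = _mm_loadu_si128((const __m128i *) hashes);
    __m128i     probe = _mm_set1_epi32((int) hash);
    __m128i     eq = _mm_cmpeq_epi32(stored, probe);

    /* Each matching 32-bit lane is all ones; collect the four sign bits. */
    return _mm_movemask_ps(_mm_castsi128_ps(eq));
#else
    int         mask = 0;

    for (int i = 0; i < 4; i++)
    {
        if (hashes[i] == hash)
            mask |= 1 << i;
    }
    return mask;
#endif
}

/* Return the slot whose hash and key both match, or -1 if there is none. */
static int
bucket_find4(const uint32_t hashes[4], const uint64_t keys[4],
             uint32_t hash, uint64_t key, key_eq_fn eq)
{
    int         mask = match_hashes4(hashes, hash);

    assert(eq != NULL);

    /* Only slots whose hash matched pay for the key comparison. */
    for (int i = 0; i < 4; i++)
    {
        if ((mask & (1 << i)) && eq(keys[i], key))
            return i;
    }
    return -1;
}
```

The key comparison stays scalar here because it goes through a function pointer, which is the part that does not obviously map onto a vector instruction.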

--
John Naylor
EDB: http://www.enterprisedb.com
