
From John Naylor
Subject RFC: Improve CPU cache locality of syscache searches
Date
Msg-id CAFBsxsE35VLJ3hHkjJARB3QWqJ0zWeDw-jzqrfzkzMPuD_Ctvw@mail.gmail.com
List pgsql-hackers
CPU caches have multiple levels, so I had an idea to use that concept in the syscaches.

Imagine if catcache buckets were cacheline-sized. In that case, we would store the most recently accessed hash values and pointers to catctup, in addition to the dlist_head:

typedef struct cc_bucket
{
  uint32 hashes[4];
  CatCTup *ct[4];
  dlist_head list;
} cc_bucket;

You can think of the bucket here as a 4-way set associative L1 and the list walk as an inclusive L2. There might be an existing term for this scheme, but I don't know what it is offhand.

Creating entries:

Instead of calling dlist_push_head(), we call dlist_push_tail() and then stash the entry in the L1 so it's still quickly available if it's accessed in the near future. This way, older entries evicted from L1 will tend to remain toward the front of the list. Walking the list will always go from oldest to newest, which might have better prefetch behavior (not sure).

Search:

Buckets with <= 4 entries would only need to access a single cacheline to get the catctup pointer with the matching hash value. If we have to walk the list to find a match, we stash it in the L1, which is faster than calling dlist_move_head().

L1 Eviction:

When putting an entry here, we memmove() everything down in each array and place the pointer and the hash value in the front, evicting the fourth (possibly NULL) entry.

The buckets would now be 4 times larger on 64-bit machines, but that might not be a problem if we just divide the initial number of buckets by four as well. If the minimum nbuckets is 2, then the smaller caches would waste a bit of space, but it doesn't seem too bad. As far as when to rehash the bucket array, the fill factor would logically jump from 2 to 8. The worst-case list walk would be longer, but with a better overall memory access pattern, it might be fine.

I think it would be straightforward to code this up -- the difficulty is testing and accounting for random effects due to binary layout changes. Thoughts?

--
