use SIMD in GetPrivateRefCountEntry() - Mailing list pgsql-hackers

From Nathan Bossart
Subject use SIMD in GetPrivateRefCountEntry()
Msg-id aN_iTFXhlPDmzvvO@nathan
List pgsql-hackers

On Wed, Sep 03, 2025 at 02:47:25PM -0400, Andres Freund wrote:
>> I see a variety of reasons for increased CPU usage:
>> 
>> 1) The private ref count infrastructure in bufmgr.c gets a bit slower once
>>    more buffers are pinned
> 
> The problem mainly seems to be that the branches in the loop at the start of
> GetPrivateRefCountEntry() are entirely unpredictable in this workload.  I had
> an old patch that tried to make it possible to use SIMD for the search, by
> using a separate array for the Buffer ids - with that gcc generates fairly
> crappy code, but does make the code branchless.
> 
> Here that substantially reduces the overhead of doing prefetching. Afterwards
> it's not a meaningful source of misses anymore.

I quickly hacked together some patches for this.  0001 adds new static
variables so that we have a separate array of the Buffer ids and an index
for the current ReservedRefCountEntry.  0002 optimizes the linear search in
GetPrivateRefCountEntry() using our simd.h routines.  The vector operations
feel expensive (see vector8_highbit_mask()'s implementation for AArch64),
but if the main goal is to avoid branches, I think this is about as
"branchless" as we can make it.  I'm going to stare at this a bit longer,
but I figured I'd get something on the list while it is fresh in my mind.
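
For anyone who wants the idea without opening the patches, here is a rough
standalone sketch of a branchless search over a separate key array.  It is
not the patch itself: the names refcount_keys and find_refcount_slot() are
made up for illustration, it uses raw SSE2 intrinsics rather than simd.h,
and __builtin_ctz() assumes gcc or clang.  The array size of 8 and Buffer
being an int do match bufmgr.c.

```c
#include <assert.h>
#include <stdint.h>
#ifdef __SSE2__
#include <emmintrin.h>
#endif

typedef int Buffer;                     /* as in PostgreSQL's buf.h */
#define InvalidBuffer 0
#define REFCOUNT_ARRAY_ENTRIES 8        /* as in bufmgr.c */

/* Hypothetical name: buffer ids kept apart from the refcount entries. */
static Buffer refcount_keys[REFCOUNT_ARRAY_ENTRIES];

/*
 * Return the index of the slot holding "buffer", or -1 if there is none.
 * All eight keys are compared unconditionally, so the only branch left
 * is the final "found anything?" test.
 */
static int
find_refcount_slot(Buffer buffer)
{
    uint32_t    mask = 0;

    /* Empty slots hold InvalidBuffer, so it must never be searched for. */
    assert(buffer != InvalidBuffer);

#ifdef __SSE2__
    {
        /* Two 128-bit vectors cover the eight 32-bit keys. */
        __m128i     key = _mm_set1_epi32(buffer);
        __m128i     lo = _mm_loadu_si128((const __m128i *) &refcount_keys[0]);
        __m128i     hi = _mm_loadu_si128((const __m128i *) &refcount_keys[4]);
        __m128i     eqlo = _mm_cmpeq_epi32(lo, key);
        __m128i     eqhi = _mm_cmpeq_epi32(hi, key);

        /* One bit per 32-bit lane: bits 0-3 from lo, bits 4-7 from hi. */
        mask = (uint32_t) _mm_movemask_ps(_mm_castsi128_ps(eqlo));
        mask |= (uint32_t) _mm_movemask_ps(_mm_castsi128_ps(eqhi)) << 4;
    }
#else
    /* Scalar fallback: same bitmask, built without data-dependent branches. */
    for (int i = 0; i < REFCOUNT_ARRAY_ENTRIES; i++)
        mask |= (uint32_t) (refcount_keys[i] == buffer) << i;
#endif

    if (mask == 0)
        return -1;
    return __builtin_ctz(mask);     /* lowest set bit = first matching slot */
}
```

The returned index would then be used to pick the matching entry out of the
parallel refcount array; a miss falls through to the hash table lookup as
before.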

-- 
nathan
