Re: add AVX2 support to simd.h - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: add AVX2 support to simd.h
Date
Msg-id 20240315174149.GA3837823@nathanxps13
Whole thread Raw
In response to Re: add AVX2 support to simd.h  (John Naylor <johncnaylorls@gmail.com>)
Responses Re: add AVX2 support to simd.h
List pgsql-hackers
On Wed, Jan 10, 2024 at 09:06:08AM +0700, John Naylor wrote:
> If we have say 25 elements, I mean (for SSE2) check the first 16, then
> the last 16. Some will be checked twice, but that's okay.

I finally got around to trying this.  0001 adds this overlapping logic.
0002 is a rebased version of the AVX2 patch (it needed some updates after
commit 9f225e9).  And 0003 is a benchmark for test_lfind32().  It runs
pg_lfind32() on an array of the given size 100M times.

I've also attached the results of running this benchmark on my machine at
HEAD, after applying 0001, and after applying both 0001 and 0002.  0001
appears to work pretty well.  When there is a small "tail," it regresses a
small amount, but overall, it seems to improve more cases than it harms.
0002 does regress searches on smaller arrays quite a bit, since it
postpones the SIMD optimizations until the arrays are longer.  It might be
possible to mitigate by using 2 registers when the "tail" is long enough,
but I have yet to try that.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: Re: MERGE ... RETURNING
Next
From: "Amonson, Paul D"
Date:
Subject: RE: Popcount optimization using AVX512