Re: Improve CRC32C performance on SSE4.2 - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: Improve CRC32C performance on SSE4.2
Date
Msg-id Z-wTcyia0z7i2Qnb@nathan
Whole thread Raw
In response to Re: Improve CRC32C performance on SSE4.2  (John Naylor <johncnaylorls@gmail.com>)
Responses Re: Improve CRC32C performance on SSE4.2
List pgsql-hackers
On Tue, Apr 01, 2025 at 05:33:02PM +0700, John Naylor wrote:
> On Thu, Mar 27, 2025 at 2:55 AM Devulapalli, Raghuveer <raghuveer.devulapalli@intel.com> wrote:
>> (1) zmm_regs_available() in pg_crc32c_sse42_choose.c is a duplicate of
>> pg_popcount_avx512.c but perhaps that’s fine for now. I will propose a
>> consolidated SIMD runtime check in a follow up patch.
> 
> Yeah, I was thinking a small amount of duplication is tolerable.

+1.  This is easy enough to change in the future if/when we add some sort
of centralized CPU feature detection.

>> (2) Might be apt to rename pg_crc32c_sse42*.c to pg_crc32c_x86*.c  since
>> they contain both sse42 and avx512 versions.
> 
> The name is now not quite accurate, but it's not exactly misleading
> either. I'm leaning towards keeping it the same, so for now I've just
> updated the header comment.

I'm not too worried about this one either.  FWIW I'm likely going to look
into moving all the x86_64 popcount stuff into pg_popcount_avx512.c and
renaming it to pg_popcount_x86_64.c for v19.  This would parallel
pg_popcount_aarch64.c a bit better, and a file per architecture seems like
a logical way to neatly organize things.

> For v16, I made another pass through made some more mostly superficial
> adjustments:
> - copied rewritten Meson comment to configure.ac
> - added some more #include guards out of paranoia
> - added tests with longer lengths that exercise the new paths
> - adjusted configure / meson messages for consistency
> - changed not-quite-accurate wording about "AVX-512 CRC instructions"
> - used "PCLMUL" only when talking about specific intrinsics and prefer
> "AVX-512" elsewhere, to head off potential future confusion with Arm
> PMULL.

I read through the code a couple of times and nothing stood out to me.

-- 
nathan



pgsql-hackers by date:

Previous
From: Noah Misch
Date:
Subject: Re: AIO v2.5
Next
From: Andres Freund
Date:
Subject: Re: AIO v2.5