Re: [PATCH] SVE popcount support - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: [PATCH] SVE popcount support
Date
Msg-id Z94xjuN9X7J9lSdT@nathan
In response to Re: [PATCH] SVE popcount support  ("Chiranmoy.Bhattacharya@fujitsu.com" <Chiranmoy.Bhattacharya@fujitsu.com>)
List pgsql-hackers
I've been preparing these for commit, and I've attached what I have so far.
A few notes:

* 0001 just renames the TRY_POPCNT_FAST macro to indicate that it's
  x86_64-specific.  IMO this is worth doing independent of this patch set,
  but it's more important with the patch set since we need something
  similar for AArch64.  I think we should also consider moving the x86_64
  stuff to its own file (perhaps combining it with the AVX-512 stuff), but
  that can probably wait until later.

* 0002 introduces the Neon implementation, which conveniently doesn't need
  configure-time checks or function pointers.  I noticed that some
  compilers (e.g., Apple clang 16) compile in Neon instructions already,
  but our hand-rolled implementation is better about instruction-level
  parallelism and seems to still be quite a bit faster.

* 0003 introduces the SVE implementation.  You'll notice I've moved all the
  function pointer gymnastics into the pg_popcount_aarch64.c file, which is
  where the Neon implementations live, too.  I also tried to clean up the
  configure checks a bit.  I imagine it's possible to make them more
  compact, but I felt that the enhanced readability was worth it.

* For both Neon and SVE, I do see improvements with looping over 4
  registers at a time, so IMHO it's worth doing so even if it performs the
  same as 2-register blocks on some hardware.  I did add a 2-register block
  in the Neon implementation for processing the tail because I was worried
  about its performance on smaller buffers, but that part might get removed
  if I can't measure any difference.
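
To illustrate the loop structure described in the last note, here is a
minimal sketch of a Neon popcount that processes four registers per
iteration, then a 2-register block for the tail, then any trailing bytes one
at a time.  This is not the code from the attached patches: the sketch_*
names are hypothetical, and the scalar fallback is only there so the example
also builds on non-AArch64 machines.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define SKETCH_USE_NEON 1		/* hypothetical macro, not from the patch */

/* Load 16 bytes, count bits per byte, and widen-add into a 2x64-bit sum. */
static inline uint64x2_t
sketch_accum(uint64x2_t acc, const uint8_t *p)
{
	return vpadalq_u32(acc, vpaddlq_u16(vpaddlq_u8(vcntq_u8(vld1q_u8(p)))));
}
#endif

/* Portable bit-at-a-time count; also handles the trailing bytes. */
static uint64_t
sketch_popcount_scalar(const uint8_t *buf, size_t bytes)
{
	uint64_t	popcnt = 0;

	while (bytes--)
	{
		uint8_t		b = *buf++;

		for (; b != 0; b >>= 1)
			popcnt += b & 1;
	}
	return popcnt;
}

static uint64_t
sketch_popcount(const uint8_t *buf, size_t bytes)
{
	uint64_t	popcnt = 0;

#ifdef SKETCH_USE_NEON
	/* Independent accumulators so the four chains can run in parallel. */
	uint64x2_t	accum1 = vdupq_n_u64(0);
	uint64x2_t	accum2 = vdupq_n_u64(0);
	uint64x2_t	accum3 = vdupq_n_u64(0);
	uint64x2_t	accum4 = vdupq_n_u64(0);

	/* Main loop: four 16-byte registers (64 bytes) per iteration. */
	for (; bytes >= 64; buf += 64, bytes -= 64)
	{
		accum1 = sketch_accum(accum1, buf);
		accum2 = sketch_accum(accum2, buf + 16);
		accum3 = sketch_accum(accum3, buf + 32);
		accum4 = sketch_accum(accum4, buf + 48);
	}

	/* Tail: at most one 2-register (32-byte) block can remain. */
	if (bytes >= 32)
	{
		accum1 = sketch_accum(accum1, buf);
		accum2 = sketch_accum(accum2, buf + 16);
		buf += 32;
		bytes -= 32;
	}

	/* Combine the accumulators and sum their two lanes. */
	popcnt += vaddvq_u64(vaddq_u64(vaddq_u64(accum1, accum2),
								   vaddq_u64(accum3, accum4)));
#endif

	/* Remaining bytes (everything, when Neon is unavailable). */
	return popcnt + sketch_popcount_scalar(buf, bytes);
}
```

Whether the 2-register tail block earns its keep is exactly the open question
above; dropping it from this sketch would leave up to 63 bytes for the
byte-at-a-time loop instead of up to 31.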

I'm planning to run several more benchmarks, but everything I've seen thus
far has looked pretty good.

-- 
nathan
