Re: Popcount optimization using AVX512 - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: Popcount optimization using AVX512
Date
Msg-id ZwXsE0KG_wh6_heU@nathan
Whole thread Raw
In response to Re: Popcount optimization using AVX512  (Nathan Bossart <nathandbossart@gmail.com>)
List pgsql-hackers
On Wed, Jul 31, 2024 at 04:43:02PM -0500, Nathan Bossart wrote:
> On Wed, Jul 31, 2024 at 01:52:54PM -0700, Andres Freund wrote:
>> I think we'd be better off enabling architectural features on a per-function
>> basis, roughly like this:
>>
>> [...]
>> 
>> /* FIXME: Should be gated by configure check of -mavx512vpopcntdq -mavx512bw support */
>> pg_enable_target("avx512vpopcntdq,avx512bw")
>> uint64_t
>> pg_popcount_avx512(const char *buf, int bytes)
>> ...
> 
> I remember wondering why the CRC-32C code wasn't already doing something
> like this (old compiler versions? non-gcc-like compilers?), and I'm not
> sure I ever discovered the reason, so out of an abundance of caution I used
> the same approach for AVX-512.  If we can convince ourselves that
> __attribute__((target("..."))) is standard enough at this point, +1 for
> moving to that.

I looked into this some more, and found the following:

* We added SSE 4.2 CRC support in April 2015 (commit 3dc2d62).  gcc support
  for __attribute__((target("sse4.2"))) was added in 4.9.0 (April 2014).
  clang support was added in 3.8 (March 2016).

* We added ARMv8 CRC support in April 2018 (commit f044d71).  gcc support
  for __attribute__((target("+crc"))) was added in 6.3 (December 2016).  I
  didn't find precisely when clang support was added, but until 16.0.0
  (March 2023), including arm_acle.h requires the -march flag [0], and you
  had to use "crc" (plus sign omitted) as the target [1].

* We added AVX-512 support in April 2024 (commit 792752a).  gcc support for
  __attribute__((target("avx512vpopcntdq,avx512bw"))) was added in 7.1 (May
  2017).  clang support was added in 5.0.0 (September 2017).  However, the
  "xsave" target was not supported until 9.1 for gcc (May 2019) and 9.0.0
  for clang (September 2019), and we need that for our AVX-512 code, too.

So, at least for the CRC code, __attribute__((target("..."))) was probably
not widely available enough yet when it was first added.  Unfortunately,
the ARMv8 CRC target support (without -march) is still pretty new, but it
might be possible to switch the others to a per-function approach in v18.

[0] https://github.com/llvm/llvm-project/commit/30b67c6
[1] https://releases.llvm.org/16.0.0/tools/clang/docs/ReleaseNotes.html#arm-and-aarch64-support

-- 
nathan



pgsql-hackers by date:

Previous
From: Richard Guo
Date:
Subject: Re: Inconsistent RestrictInfo serial numbers
Next
From: Tatsuo Ishii
Date:
Subject: Re: Doc: typo in config.sgml