On Wed, Jul 31, 2024 at 04:43:02PM -0500, Nathan Bossart wrote:
> On Wed, Jul 31, 2024 at 01:52:54PM -0700, Andres Freund wrote:
>> I think we'd be better off enabling architectural features on a per-function
>> basis, roughly like this:
>>
>> [...]
>>
>> /* FIXME: Should be gated by configure check of -mavx512vpopcntdq -mavx512bw support */
>> pg_enable_target("avx512vpopcntdq,avx512bw")
>> uint64_t
>> pg_popcount_avx512(const char *buf, int bytes)
>> ...
>
> I remember wondering why the CRC-32C code wasn't already doing something
> like this (old compiler versions? non-gcc-like compilers?), and I'm not
> sure I ever discovered the reason, so out of an abundance of caution I used
> the same approach for AVX-512. If we can convince ourselves that
> __attribute__((target("..."))) is standard enough at this point, +1 for
> moving to that.
I looked into this some more, and found the following:
* We added SSE 4.2 CRC support in April 2015 (commit 3dc2d62). gcc support
for __attribute__((target("sse4.2"))) was added in 4.9.0 (April 2014).
clang support was added in 3.8 (March 2016).
* We added ARMv8 CRC support in April 2018 (commit f044d71). gcc support
for __attribute__((target("+crc"))) was added in 6.3 (December 2016). I
didn't find precisely when clang support was added, but until 16.0.0
(March 2023), including arm_acle.h requires the -march flag [0], and you
had to use "crc" (plus sign omitted) as the target [1].
* We added AVX-512 support in April 2024 (commit 792752a). gcc support for
__attribute__((target("avx512vpopcntdq,avx512bw"))) was added in 7.1 (May
2017). clang support was added in 5.0.0 (September 2017). However, the
"xsave" target was not supported until 9.1 for gcc (May 2019) and 9.0.0
for clang (September 2019), and we need that for our AVX-512 code, too.
So, at least for the CRC code, __attribute__((target("..."))) was probably
not widely available enough yet when it was first added. Unfortunately,
the ARMv8 CRC target support (without -march) is still pretty new, but it
might be possible to switch the others to a per-function approach in v18.
[0] https://github.com/llvm/llvm-project/commit/30b67c6
[1] https://releases.llvm.org/16.0.0/tools/clang/docs/ReleaseNotes.html#arm-and-aarch64-support
--
nathan