Re: CRC32C Parallel Computation Optimization on ARM - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: CRC32C Parallel Computation Optimization on ARM
Date
Msg-id 20231025154325.GB981848@nathanxps13
In response to RE: CRC32C Parallel Computation Optimization on ARM  (Xiang Gao <Xiang.Gao@arm.com>)
Responses RE: CRC32C Parallel Computation Optimization on ARM
List pgsql-hackers
+pg_crc32c
+pg_comp_crc32c_with_vmull_armv8(pg_crc32c crc, const void *data, size_t len)

It looks like most of this function is duplicated from
pg_comp_crc32c_armv8().  I understand that we probably need a separate
function because of the runtime check, but perhaps we could create a common
static inline helper function with a branch for when vmull_p64() can be
used.  Its callers would then just provide a boolean to indicate which
branch to take.
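
To make the suggestion concrete, here is a rough, portable sketch of the
shape such a helper could take.  Everything in it is an assumption for
illustration: the names (crc32c_t, crc32c_common, crc32c_plain,
crc32c_wide) are hypothetical, and a bit-at-a-time software CRC-32C stands
in for both the ARMv8 CRC intrinsics and the vmull_p64() block, so that it
compiles on any platform.  It is not the patch's code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t crc32c_t;      /* hypothetical stand-in for pg_crc32c */

/* One byte of bit-at-a-time CRC-32C (reflected polynomial 0x82F63B78). */
static inline crc32c_t
crc32c_byte(crc32c_t crc, uint8_t b)
{
    crc ^= b;
    for (int i = 0; i < 8; i++)
        crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    return crc;
}

/*
 * Shared body.  Each wrapper below passes a constant for use_wide_path,
 * so after inlining the compiler can drop the branch that is not taken.
 */
static inline crc32c_t
crc32c_common(crc32c_t crc, const void *data, size_t len, bool use_wide_path)
{
    const uint8_t *p = data;

    if (use_wide_path)
    {
        /*
         * Placeholder for the vmull_p64() block loop.  Here it merely
         * consumes 8 bytes per iteration with the same software CRC.
         */
        while (len >= 8)
        {
            for (int i = 0; i < 8; i++)
                crc = crc32c_byte(crc, p[i]);
            p += 8;
            len -= 8;
        }
    }

    /* Tail bytes, or the whole input when use_wide_path is false. */
    for (; len > 0; len--)
        crc = crc32c_byte(crc, *p++);

    return crc;
}

/* Thin callers: the only difference is the boolean they pass. */
crc32c_t
crc32c_plain(crc32c_t crc, const void *data, size_t len)
{
    return crc32c_common(crc, data, len, false);
}

crc32c_t
crc32c_wide(crc32c_t crc, const void *data, size_t len)
{
    return crc32c_common(crc, data, len, true);
}
```

Both wrappers keep the existing calling convention (running CRC in, running
CRC out), so a function-pointer based runtime check could still select
between them.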

+# Use ARM VMULL if available and ARM CRC32C intrinsic is avaliable too.
+if test x"$USE_ARMV8_VMULL" = x"" && (test x"$USE_ARMV8_CRC32C" = x"1" || test x"$USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK" =x"1"); then
 
+  if test x"$pgac_armv8_vmull_intrinsics" = x"yes"; then
+    USE_ARMV8_VMULL=1
+  fi
+fi

Hm.  I wonder if we need to switch to a runtime check in some cases.  For
example, what happens if the ARMv8 intrinsics used today are found with the
default compiler flags, but vmull_p64() is only available if
-march=armv8-a+crypto is added?  It looks like the precedent is to use a
runtime check if we need extra CFLAGS to produce code that uses the
intrinsics.
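
For reference, a minimal sketch of the runtime-check pattern I have in
mind, under stated assumptions: the names (crc32c_t, comp_crc32c,
crc32c_choose, cpu_has_pmull, crc32c_pmull_placeholder) are hypothetical,
the "PMULL" implementation is only a placeholder that forwards to the
baseline, and the detection uses getauxval() with HWCAP_PMULL on Linux
aarch64 and reports "not available" everywhere else.  A real version would
put the vmull_p64() code in a file built with the extra CFLAGS.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>
#include <asm/hwcap.h>
#endif

typedef uint32_t crc32c_t;      /* hypothetical stand-in for pg_crc32c */

/* Portable bit-at-a-time CRC-32C; stands in for the existing code path. */
static crc32c_t
crc32c_baseline(crc32c_t crc, const void *data, size_t len)
{
    const uint8_t *p = data;

    for (; len > 0; len--)
    {
        crc ^= *p++;
        for (int i = 0; i < 8; i++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/*
 * Placeholder for a vmull_p64()-based version compiled with extra CFLAGS.
 * It only forwards to the baseline so that the sketch runs anywhere.
 */
static crc32c_t
crc32c_pmull_placeholder(crc32c_t crc, const void *data, size_t len)
{
    return crc32c_baseline(crc, data, len);
}

/* Ask the kernel whether the CPU advertises PMULL; false if we can't tell. */
static bool
cpu_has_pmull(void)
{
#if defined(__linux__) && defined(__aarch64__) && defined(HWCAP_PMULL)
    return (getauxval(AT_HWCAP) & HWCAP_PMULL) != 0;
#else
    return false;
#endif
}

static crc32c_t crc32c_choose(crc32c_t crc, const void *data, size_t len);

/* Starts out pointing at the chooser; replaced on the first call. */
static crc32c_t (*comp_crc32c) (crc32c_t, const void *, size_t) = crc32c_choose;

static crc32c_t
crc32c_choose(crc32c_t crc, const void *data, size_t len)
{
    comp_crc32c = cpu_has_pmull() ? crc32c_pmull_placeholder : crc32c_baseline;
    return comp_crc32c(crc, data, len);
}
```

The check runs once; every later call goes straight through the function
pointer, so the per-call cost is one indirect call.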

Separately, I wonder if we should just always do runtime checks for the CRC
stuff whenever we can produce code with the intrinsics, regardless of
whether we need extra CFLAGS.  The check doesn't look terribly expensive,
and it might allow us to simplify the code a bit (especially now that we
support a few different architectures).

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com


