Re: CRC32C Parallel Computation Optimization on ARM - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: CRC32C Parallel Computation Optimization on ARM
Date
Msg-id 20231102143550.GA2152@nathanxps13
In response to RE: CRC32C Parallel Computation Optimization on ARM  (Xiang Gao <Xiang.Gao@arm.com>)
Responses RE: CRC32C Parallel Computation Optimization on ARM
List pgsql-hackers
On Thu, Nov 02, 2023 at 06:17:20AM +0000, Xiang Gao wrote:
> After reading the discussion, I understand that in order to avoid performance
> regression in some instances, we need to try our best to avoid runtime checks.
> I don't know if I understand it correctly.

The idea is that we don't want to start forcing runtime checks on builds
that don't already do them.  In other words, if the compiler can use
the ARMv8 CRC instructions with the default compiler flags, we should only
use vmull_p64() if it can also be used with the default compiler flags.  I
suspect this limitation sounds worse than it actually is in practice.  The
vast majority of the buildfarm uses runtime checks, and at least some of
the platforms that don't, such as the Apple M-series machines, seem to
include +crypto by default.

Of course, if a compiler picks up +crc but not +crypto in its defaults, we
could lose the vmull_p64() optimization on that platform.  But as noted in
the other thread, we can revisit if these kinds of hypothetical situations
become reality.
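
As a rough illustration of the rule above (a sketch, not code from the patch),
the decision can be written as a small C function.  The enum, the function
name, and the three boolean inputs are hypothetical; in practice this choice
would be made by the configure/meson probes rather than at runtime.

```c
#include <stdbool.h>

/* Hypothetical outcomes of the build-time decision. */
typedef enum
{
    VMULL_NONE,                 /* do not use vmull_p64() at all */
    VMULL_DIRECT,               /* usable with the default compiler flags */
    VMULL_WITH_RUNTIME_CHECK    /* needs +crypto and a runtime check */
} vmull_mode;

/*
 * crc_uses_runtime_check: the build already does a runtime check for CRC
 * vmull_default_flags:    vmull_p64() compiles with the default flags
 * vmull_with_crypto_flag: vmull_p64() compiles once +crypto is added
 */
static vmull_mode
decide_vmull_mode(bool crc_uses_runtime_check,
                  bool vmull_default_flags,
                  bool vmull_with_crypto_flag)
{
    /* Available by default: no extra flags and no new runtime check. */
    if (vmull_default_flags)
        return VMULL_DIRECT;

    /* Only piggyback on a runtime check that the build already has. */
    if (vmull_with_crypto_flag && crc_uses_runtime_check)
        return VMULL_WITH_RUNTIME_CHECK;

    /* Otherwise skip the optimization rather than force a new check. */
    return VMULL_NONE;
}
```

Under these assumptions, a compiler that enables +crc but not +crypto by
default maps to VMULL_NONE, which is the hypothetical case where the
optimization is lost.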

> Could you please give me some suggestions about how to refine this patch?

Of course.  I think we'll ultimately want to independently check for the
availability of the new instruction like we do for the other sets of
intrinsics:

    PGAC_ARMV8_VMULL_INTRINSICS([])
    if test x"$pgac_armv8_vmull_intrinsics" != x"yes"; then
        PGAC_ARMV8_VMULL_INTRINSICS([-march=armv8-a+crypto])
    fi

My current thinking is that we'll want to add USE_ARMV8_VMULL and
USE_ARMV8_VMULL_WITH_RUNTIME_CHECK and use those to decide exactly what to
compile.  I'll admit I haven't fully thought through every detail yet, but
I'm cautiously optimistic that we can avoid too much complexity in the
autoconf/meson scripts.
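
To make the idea concrete, here is a minimal sketch of how C code might
consume the two proposed macros.  Only the macro names come from the
discussion above; the enum, the function name, and the cpu_has_pmull
parameter (standing in for whatever CPU feature probe would be used) are
assumptions for illustration.

```c
#include <stdbool.h>

/*
 * Hypothetical: a real build would have configure/meson define at most one
 * of these, never set them by hand.
 */
/* #define USE_ARMV8_VMULL 1 */
/* #define USE_ARMV8_VMULL_WITH_RUNTIME_CHECK 1 */

typedef enum
{
    CRC32C_PATH_PLAIN,          /* existing CRC32C code path */
    CRC32C_PATH_VMULL           /* path that also uses vmull_p64() */
} crc32c_path;

/* cpu_has_pmull is the result of a (hypothetical) runtime CPU probe. */
static crc32c_path
choose_crc32c_path(bool cpu_has_pmull)
{
#if defined(USE_ARMV8_VMULL)
    /* Compiled with default flags: always available, no probe needed. */
    (void) cpu_has_pmull;
    return CRC32C_PATH_VMULL;
#elif defined(USE_ARMV8_VMULL_WITH_RUNTIME_CHECK)
    /* Built with extra flags: use it only if the CPU reports support. */
    return cpu_has_pmull ? CRC32C_PATH_VMULL : CRC32C_PATH_PLAIN;
#else
    /* Neither macro defined: the vmull_p64() path is not compiled in. */
    (void) cpu_has_pmull;
    return CRC32C_PATH_PLAIN;
#endif
}
```

With neither macro defined, as in the listing, the function always returns
CRC32C_PATH_PLAIN regardless of what the probe reports.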

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com


