Re: Improve CRC32C performance on SSE4.2 - Mailing list pgsql-hackers

From John Naylor
Subject Re: Improve CRC32C performance on SSE4.2
Msg-id CANWCAZa+dtOQ50V6s85jR-B53P8MsMcGiZ9WGLAGEGiE_TtyGQ@mail.gmail.com
In response to Re: Improve CRC32C performance on SSE4.2  (Nathan Bossart <nathandbossart@gmail.com>)
Responses RE: Improve CRC32C performance on SSE4.2
List pgsql-hackers
On Tue, Mar 11, 2025 at 4:47 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Mon, Mar 10, 2025 at 03:48:31PM +0700, John Naylor wrote:
> > On Tue, Mar 4, 2025 at 2:11 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
> >> Overall, I wish we could avoid splitting things into separate files and
> >> adding more header file gymnastics, but maybe there isn't much better we
> >> can do without overhauling the CPU feature detection code.
> >
> > I wanted to make an attempt to make this aspect nicer. v13-0002
> > incorporates deliberately compact and simple loops for inlined
> > constant input into the dispatch function, and leaves the existing
> > code alone. This avoids code churn and saves vertical space in the
> > copied code. It needs a bit more commentary, but I hope this is a more
> > digestible prerequisite to the CLMUL algorithm -- as a reminder, it'll
> > be simpler if we can always assume non-constant input can go through a
> > function pointer.
>
> That is certainly more readable.  FWIW I think it would be entirely
> reasonable to replace the pg_crc32c_sse42.c implementation with a call to
> this new pg_comp_crc32c_dispatch() function.  Of course, you'd have to
> split things up like:
> [snip]

That could work as well. If we do PMULL on Arm, though, it might be
advantageous to keep the inline path and the function path coded
separately: because alignment matters more on that platform, it might
not be worth pre-aligning the pointer to 8 bytes for a 20-byte
constant input.
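
To illustrate the split between the inline path and the function path,
here is a minimal sketch. It is not the code from the patch: all names
(crc32c_byte, crc32c_generic, crc32c_fn, crc32c_dispatch) are
hypothetical, the 32-byte cutoff is an arbitrary assumption, and a
portable bitwise CRC-32C stands in for the hardware instructions. It
relies on the GCC/Clang builtin __builtin_constant_p and falls back to
the function pointer on other compilers.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable bitwise CRC-32C step (reflected polynomial 0x82F63B78).
 * Stand-in for a hardware CRC instruction; hypothetical helper. */
static uint32_t
crc32c_byte(uint32_t crc, uint8_t b)
{
    crc ^= b;
    for (int i = 0; i < 8; i++)
        crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    return crc;
}

/* Out-of-line path: everything that is not a small constant length. */
static uint32_t
crc32c_generic(uint32_t crc, const void *data, size_t len)
{
    const uint8_t *p = data;

    while (len--)
        crc = crc32c_byte(crc, *p++);
    return crc;
}

/* Function pointer that a runtime check could repoint (assumption). */
static uint32_t (*crc32c_fn) (uint32_t, const void *, size_t) = crc32c_generic;

/* Inline dispatch: a compact loop for small compile-time-constant
 * lengths, with no pointer pre-alignment; otherwise the pointer. */
static inline uint32_t
crc32c_dispatch(uint32_t crc, const void *data, size_t len)
{
#if defined(__GNUC__) || defined(__clang__)
    if (__builtin_constant_p(len) && len < 32)  /* cutoff is arbitrary */
    {
        const uint8_t *p = data;

        for (size_t i = 0; i < len; i++)
            crc = crc32c_byte(crc, p[i]);
        return crc;
    }
#endif
    return crc32c_fn(crc, data, len);
}
```

Both paths must of course produce the same result; the point of keeping
them separate is only that the inline loop can skip setup work, such as
alignment, that pays off only for longer inputs.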

I've gone ahead and added the generated AVX-512 algorithm in v14-0005,
and added the build support and some of the runtime support from Paul
and Raghuveer's earlier patches in 0006 and 0007. It passes CI, but
I'll have
to arrange access to other hardware to verify the runtime behavior. I
think the Meson support is most of the way there, but it looks like
configure.ac got whacked around cosmetically quite a bit. If we feel
it's time to refactor things there, we'll want to split that out. In
any case, for autoconf I've pretty much kept the earlier work for now.
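
For readers unfamiliar with the runtime side, the following is a rough
sketch of choosing an implementation from CPU features. It is an
assumption-laden illustration, not the patch: the enum and function
names are hypothetical, and the feature strings tested here ("avx512f",
"pclmul", "sse4.2") are only stand-ins. A real check would test exactly
the extensions the chosen algorithm uses. It relies on the GCC/Clang
builtins __builtin_cpu_init and __builtin_cpu_supports, and always
picks the generic fallback on other compilers or architectures.

```c
#include <stddef.h>

/* Hypothetical tags for the available implementations. */
typedef enum
{
    CRC_IMPL_GENERIC = 0,
    CRC_IMPL_SSE42,
    CRC_IMPL_AVX512
} crc_impl;

/* Pick the widest implementation the running CPU appears to support. */
static crc_impl
crc32c_choose_impl(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && \
    (defined(__GNUC__) || defined(__clang__))
    __builtin_cpu_init();
    /* Stand-in feature tests; see the caveat in the text above. */
    if (__builtin_cpu_supports("avx512f") &&
        __builtin_cpu_supports("pclmul"))
        return CRC_IMPL_AVX512;
    if (__builtin_cpu_supports("sse4.2"))
        return CRC_IMPL_SSE42;
#endif
    return CRC_IMPL_GENERIC;
}

/* Human-readable name, e.g. for logging which path was selected. */
static const char *
crc_impl_name(crc_impl impl)
{
    switch (impl)
    {
        case CRC_IMPL_AVX512:
            return "avx512";
        case CRC_IMPL_SSE42:
            return "sse42";
        default:
            return "generic";
    }
}
```

Because the result depends on the machine, a CI pass alone only shows
that whichever branch the CI hosts take works; the other branches need
hardware that actually reports those features.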

--
John Naylor
Amazon Web Services
