On Tue, Mar 11, 2025 at 4:47 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Mon, Mar 10, 2025 at 03:48:31PM +0700, John Naylor wrote:
> > On Tue, Mar 4, 2025 at 2:11 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
> >> Overall, I wish we could avoid splitting things into separate files and
> >> adding more header file gymnastics, but maybe there isn't much better we
> >> can do without overhauling the CPU feature detection code.
> >
> > I wanted to make an attempt to make this aspect nicer. v13-0002
> > incorporates deliberately compact and simple loops for inlined
> > constant input into the dispatch function, and leaves the existing
> > code alone. This avoids code churn and saves vertical space in the
> > copied code. It needs a bit more commentary, but I hope this is a more
> > digestible prerequisite to the CLMUL algorithm -- as a reminder, it'll
> > be simpler if we can always assume non-constant input can go through a
> > function pointer.
>
> That is certainly more readable. FWIW I think it would be entirely
> reasonable to replace the pg_crc32c_sse42.c implementation with a call to
> this new pg_comp_crc32c_dispatch() function. Of course, you'd have to
> split things up like:
> [snip]
That could work as well. I'm thinking if we do PMULL on Arm, it might
be advantageous to keep the inline path and function paths with
distinct coding -- because of the pickier alignment on that platform,
it might not be worth pre-aligning the pointer to 8 bytes for a
20-byte constant input.
I've gone ahead and added the generated AVX-512 algorithm in v14-0005,
and added the build support and some of the runtime support from Paul
and Raghuveer's earlier patches in 0006-7. It passes CI, but I'll have
to arrange access to other hardware to verify the runtime behavior. I
think the Meson support is most of the way there, but it looks like
configure.ac got whacked around cosmetically quite a bit. If we feel
it's time to refactor things there, we'll want to split that out. In
any case, for autoconf I've pretty much kept the earlier work for now.
--
John Naylor
Amazon Web Services