Re: CRC32C Parallel Computation Optimization on ARM - Mailing list pgsql-hackers

From John Naylor
Subject Re: CRC32C Parallel Computation Optimization on ARM
Msg-id CANWCAZbLdjnQg4ha3ajz_YfA5jf2V8w45x+=K0EHbymDP4HytQ@mail.gmail.com
In response to RE: CRC32C Parallel Computation Optimization on ARM  (Xiang Gao <Xiang.Gao@arm.com>)
List pgsql-hackers
On Mon, Dec 4, 2023 at 2:27 PM Xiang Gao <Xiang.Gao@arm.com> wrote:
>
> [v8 patch]

I have a couple of quick thoughts on this:

1. I looked at a couple implementations of this idea, and found that
the constants used in the carryless multiply are tied to the length of
the blocks. With a lookup table of those constants, we can run the
3-way algorithm on any portion of a full block length, rather than
immediately falling back to computing the CRC serially. That would be
faster on average. See for example
https://github.com/komrad36/CRC/tree/master, but I don't think we
actually have to fully unroll the loop like they do there.

2. With the above, we can use a larger full block size, and so on
average less time would be spent in the carryless multiply. With that,
we could possibly get away with an open-coded loop in normal C rather
than a new intrinsic (also found in the above repo). That would be
more portable.
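
To make the two points above concrete, here is a rough sketch in plain C.
It is not taken from the v8 patch or from the linked repo. All names
(crc32c_serial, gf2_mulmod, shift_tab, MAX_BLOCK, crc32c_3way) are
hypothetical, the block size of 64 bytes is arbitrary, and the bitwise
CRC stands in for the hardware CRC instruction. The open-coded multiply
here is a modular GF(2) product, which is a simpler substitute for the
carryless multiply plus reduction that a real implementation would use.
The point is only that the combining constant x^(8n) mod P depends on the
block length n, so a table indexed by n lets a short tail go through the
same 3-way path:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRC32C_POLY 0x82F63B78u	/* reflected Castagnoli polynomial */
#define MAX_BLOCK	64			/* hypothetical full block length, in bytes */

/* Serial update of the raw CRC register (no initial or final XOR). */
static uint32_t
crc32c_serial(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--)
	{
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ ((crc & 1) ? CRC32C_POLY : 0);
	}
	return crc;
}

/* Open-coded product of two reflected polynomials modulo the CRC polynomial. */
static uint32_t
gf2_mulmod(uint32_t a, uint32_t b)
{
	uint32_t	prod = 0;

	/* Bit 31 is the x^0 coefficient in the reflected representation. */
	for (uint32_t m = 0x80000000u; m != 0; m >>= 1)
	{
		if (a & m)
			prod ^= b;
		b = (b >> 1) ^ ((b & 1) ? CRC32C_POLY : 0);	/* b = b * x mod P */
	}
	return prod;
}

/* shift_tab[n] = x^(8n) mod P: the constant for a block of n bytes. */
static uint32_t shift_tab[MAX_BLOCK + 1];

static void
init_shift_tab(void)
{
	static const unsigned char zero = 0;

	shift_tab[0] = 0x80000000u;	/* x^0 */
	for (int n = 1; n <= MAX_BLOCK; n++)
		shift_tab[n] = crc32c_serial(shift_tab[n - 1], &zero, 1);
}

/* 3-way update that also handles lengths shorter than 3 * MAX_BLOCK. */
static uint32_t
crc32c_3way(uint32_t crc, const unsigned char *p, size_t len)
{
	if (shift_tab[0] == 0)
		init_shift_tab();

	while (len >= 3)
	{
		size_t		n = (len / 3 > MAX_BLOCK) ? MAX_BLOCK : len / 3;
		uint32_t	a, b, c, s;

		assert(n >= 1 && n <= MAX_BLOCK);
		s = shift_tab[n];

		/* Three independent streams; hardware could overlap these. */
		a = crc32c_serial(crc, p, n);
		b = crc32c_serial(0, p + n, n);
		c = crc32c_serial(0, p + 2 * n, n);

		/* Shift a by 2n bytes and b by n bytes, then fold into c. */
		crc = gf2_mulmod(gf2_mulmod(a, s) ^ b, s) ^ c;

		p += 3 * n;
		len -= 3 * n;
	}

	/* At most two bytes are left for the serial path. */
	return crc32c_serial(crc, p, len);
}
```

A caller would get the usual CRC-32C with
crc32c_3way(0xFFFFFFFF, buf, len) ^ 0xFFFFFFFF. With a 200-byte input,
this sketch does one pass with n = 64, then one with n = 2, and leaves
only 2 bytes for the serial path. A real implementation would likely
restrict the table to a coarser set of lengths and use two constants per
length so the two multiplies are independent.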

--
John Naylor
Amazon Web Services.
