+ * For This Function:
+ * Copyright 2015 The Chromium Authors
I went and looked at the Chromium source, and found the following
snippet that uses the same technique, but only requires 128-bit CLMUL
and has a minimum input size of 64 bytes, rather than 256. This seems
like it might be better suited for shorter inputs. It also seems much
easier than trying to get the AVX-512 hippo to dance. It uses the IEEE
polynomial, so would need new constants calculated for ours, but that
had to be done for the shared patch, too.
https://github.com/chromium/chromium/blob/main/third_party/zlib/crc32_simd.c#L215
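
As a rough illustration of what "new constants" involves, here is a
minimal sketch of the basic building block: computing x^n mod P(x) over
GF(2) for a given CRC polynomial. The names xpow_mod and reflect are
hypothetical, the exponents in the demo loop are placeholders, and the
use of the CRC-32C (Castagnoli) polynomial as the target is an
assumption. The exact exponents, bit reflection, and shifts that the
folding code expects would have to be taken from the algorithm itself,
so this is not a drop-in generator.

```python
# Polynomials in normal (MSB-first) form with the x^32 term implicit.
IEEE_POLY = 0x04C11DB7        # CRC-32 (IEEE 802.3)
CASTAGNOLI_POLY = 0x1EDC6F41  # CRC-32C, assumed here to be the target


def xpow_mod(n, poly, degree=32):
    """Return x^n mod P(x) over GF(2), P = x^degree + poly."""
    full = (1 << degree) | poly
    r = 1                       # the polynomial "1", i.e. x^0
    for _ in range(n):
        r <<= 1                 # multiply by x
        if (r >> degree) & 1:   # degree reached: subtract (XOR) P(x)
            r ^= full
    return r


def reflect(value, width):
    """Reverse the low `width` bits of value (bit-reflected form)."""
    out = 0
    for i in range(width):
        if (value >> i) & 1:
            out |= 1 << (width - 1 - i)
    return out


if __name__ == "__main__":
    # Placeholder exponents, for illustration only.
    for n in (64, 128, 512):
        r = xpow_mod(n, CASTAGNOLI_POLY)
        print(f"x^{n} mod P = 0x{r:08x}, reflected = 0x{reflect(r, 32):08x}")
```

Running the same loop with IEEE_POLY and comparing against the constants
in the Chromium file would be one way to work out which exponents and
bit ordering that code expects before switching polynomials.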
--
John Naylor
Amazon Web Services