On Mon, Oct 30, 2023 at 01:48:29PM -0700, Jeff Davis wrote:
> I assume you are concerned about the call going through a function
> pointer? If so, is it possible that setting a flag and then branching
> would be better?
>
> Also, if it's a concern, should we also consider making an inlineable
> version of pg_comp_crc32c_sse42()?

I tested pg_waldump -z with 50M 65-byte records for the following
implementations on an ARM system:

* slicing-by-8 : ~3.08s
* proposed patches applied (runtime check) : ~2.44s
* only CRC intrinsics implementation compiled : ~2.42s
* forced inlining : ~2.38s

Avoiding the runtime check produced a 0.8% improvement, and forced inlining
produced another 1.7% improvement. In comparison, even the runtime check
implementation produced a 20.8% improvement over the slicing-by-8 one.

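
To make the dispatch options under discussion concrete, here is a minimal
sketch of the three call styles: a function pointer resolved on first use, a
cached flag plus a branch, and a direct (inlineable) call. This is not the
code from the patches. Every name in it (crc32c_bitwise, crc32c_hw_standin,
cpu_has_crc, crc32c_ptr, crc32c_via_flag) is hypothetical, the "hardware"
routine is only a stand-in that reuses the portable bitwise loop so the
example builds on any machine, and the CPU probe is hard-wired to false.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Portable bitwise CRC-32C (reflected polynomial 0x82F63B78).
 * Plays the role of the software fallback; much slower than slicing-by-8. */
static uint32_t
crc32c_bitwise(uint32_t crc, const void *data, size_t len)
{
    const unsigned char *p = data;

    while (len--)
    {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc;
}

/* ASSUMPTION: stand-in for an intrinsics-based routine.  It just calls the
 * bitwise loop so that the sketch compiles without CRC instructions. */
static inline uint32_t
crc32c_hw_standin(uint32_t crc, const void *data, size_t len)
{
    return crc32c_bitwise(crc, data, len);
}

/* ASSUMPTION: hypothetical runtime CPU probe, hard-wired for the sketch. */
static bool
cpu_has_crc(void)
{
    return false;
}

/* Option 1: function pointer.  The first call runs the probe and repoints
 * the pointer; every later call is one indirect call with no check. */
static uint32_t crc32c_choose(uint32_t crc, const void *data, size_t len);
static uint32_t (*crc32c_ptr) (uint32_t, const void *, size_t) = crc32c_choose;

static uint32_t
crc32c_choose(uint32_t crc, const void *data, size_t len)
{
    crc32c_ptr = cpu_has_crc() ? crc32c_hw_standin : crc32c_bitwise;
    return crc32c_ptr(crc, data, len);
}

/* Option 2: cached flag plus branch.  Each call tests the flag, but both
 * targets are direct calls that the compiler is free to inline. */
static int crc_flag = -1;       /* -1 = not probed yet, 0 = no, 1 = yes */

static inline uint32_t
crc32c_via_flag(uint32_t crc, const void *data, size_t len)
{
    if (crc_flag < 0)
        crc_flag = cpu_has_crc() ? 1 : 0;
    return crc_flag ? crc32c_hw_standin(crc, data, len)
                    : crc32c_bitwise(crc, data, len);
}

/* Option 3: no runtime check at all.  Callers use crc32c_hw_standin()
 * directly, which only makes sense when the build already targets a CPU
 * that is known to have the instructions. */
```

In each style the caller starts from 0xFFFFFFFF and inverts the result at the
end, e.g. crc32c_ptr(0xFFFFFFFF, buf, len) ^ 0xFFFFFFFF. The sketch only shows
where the check and the call happen; it says nothing about how the real
implementations compare in speed.
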
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com