Re: Improve CRC32C performance on SSE4.2 - Mailing list pgsql-hackers

From John Naylor
Subject Re: Improve CRC32C performance on SSE4.2
Msg-id CANWCAZY1Le1tpTZauY-JzbLpk=VSerP8=GZs36Cza9iJfRnn-A@mail.gmail.com
In response to Re: Improve CRC32C performance on SSE4.2  (John Naylor <johncnaylorls@gmail.com>)
List pgsql-hackers
On Mon, Mar 24, 2025 at 6:37 PM John Naylor <johncnaylorls@gmail.com> wrote:
>
> I'll take a look at the configure
> checks soon, since I had some questions there.

One other thing I forgot to mention: The previous test function had
local constants that the compiler was able to fold, resulting in no
actual vector instructions being emitted:

movabs  rdx, 12884901891
xor     eax, eax
crc32   rax, rdx
crc32   rax, rdx
ret

That may be okay for practical purposes, but in the spirit of commit
fdb5dd6331e30 I changed it in v15 to use global variables and made
sure it emits the vector instructions that the function attributes
are intended to enable:

vmovdqu64       zmm3, ZMMWORD PTR x[rip]
xor     eax, eax
vpclmulqdq      zmm0, zmm3, ZMMWORD PTR y[rip], 0
vextracti32x4   xmm2, zmm0, 1
vmovdqa64       xmm1, xmm0
vmovdqu64       ZMMWORD PTR y[rip], zmm0
vextracti32x4   xmm0, zmm0, 2
vpternlogq      xmm1, xmm2, xmm0, 150
vmovq   rdx, xmm1
crc32   rax, rdx
vzeroupper
ret
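
The difference between the two listings can be sketched in portable C.
This is only an illustration under stated assumptions, not the test
function from the patch: it uses no intrinsics so that it builds
anywhere, crc32c_u64() is a hypothetical bitwise stand-in for the
crc32 instruction, and the names probe_with_locals(),
probe_with_globals() and probe_input are made up for the example. The
constant 0x300000003 is the immediate (12884901891) from the first
listing.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the hardware crc32 instruction: a bitwise
 * CRC-32C update (reflected polynomial 0x82F63B78) over the eight bytes
 * of one 64-bit word, least significant byte first.
 */
static uint32_t
crc32c_u64(uint32_t crc, uint64_t data)
{
	for (int i = 0; i < 8; i++)
	{
		crc ^= (uint8_t) (data >> (8 * i));
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/*
 * Every input is a compile-time constant, so an optimizing compiler is
 * free to evaluate part or all of this at build time. A probe written
 * this way may not exercise the instructions it is meant to test.
 */
uint32_t
probe_with_locals(void)
{
	const uint64_t x = 0x0000000300000003ULL;
	uint32_t	crc = 0;

	crc = crc32c_u64(crc, x);
	crc = crc32c_u64(crc, x);
	return crc;
}

/*
 * A non-const global with external linkage: its value at call time is
 * not known to the compiler, so the computation has to be emitted.
 */
uint64_t	probe_input = 0x0000000300000003ULL;

uint32_t
probe_with_globals(void)
{
	uint32_t	crc = 0;

	crc = crc32c_u64(crc, probe_input);
	crc = crc32c_u64(crc, probe_input);
	return crc;
}
```

Both functions return the same value. What differs is what the
compiler may emit for them, which can be checked by comparing the
output of "cc -O2 -S" for the two.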

--
John Naylor
Amazon Web Services