Re: Proposal for Updating CRC32C with AVX-512 Algorithm. - Mailing list pgsql-hackers

From John Naylor
Subject Re: Proposal for Updating CRC32C with AVX-512 Algorithm.
Date
Msg-id CANWCAZaWe68AGkY2y7CScf-zcWfs0dGTCobKuOjKA_FQyauEQA@mail.gmail.com
Whole thread Raw
In response to Re: Proposal for Updating CRC32C with AVX-512 Algorithm.  (Andres Freund <andres@anarazel.de>)
Responses Re: Proposal for Updating CRC32C with AVX-512 Algorithm.
List pgsql-hackers
On Thu, Jun 13, 2024 at 2:37 AM Andres Freund <andres@anarazel.de> wrote:
>
> It's hard to understand, but a nonetheless helpful page is
> https://users.ece.cmu.edu/~koopman/crc/crc32.html which lists properties for
> crc32c:
> https://users.ece.cmu.edu/~koopman/crc/c32/0x8f6e37a0_len.txt
> which lists
> (0x8f6e37a0; 0x11edc6f41) <=> (0x82f63b78; 0x105ec76f1)
{2147483615,2147483615,5243,5243,177,177,47,47,20,20,8,8,6,6,1,1}| gold | (*op) iSCSI; CRC-32C; CRC-32/4 
>
> This cryptic notion AFAIU indicates that for our polynomial we can detect 2bit
> errors up to a length of 2147483615 bytes, 3 bit errors up to 2147483615, 3
> and 4 bit errors up to 5243, 5 and 6 bit errors up to 177, 7/8 bit errors up
> to 47.

One aspect of that cryptic notation that you seemed to have missed is
"(*op)" -- explained as:

*p - primitive polynomial. This has optimal length for HD=3, and good
HD=2 performance above that length.
*o - odd bit errors detected. This has a factor of (x+1) and detects
all odd bit errors (implying that even number of bit errors have an
elevated undetected error rate)
*op - odd bit errors detected plus primitive. This is a primitive
polynomial times (x+1). It has optimal length for HD=4, and detects
all odd bit errors.

This means it's not really a 32-bit checksum -- it's a 1-bit checksum
plus a 31-bit checksum. The 1-bit checksum can detect any odd number
of bit-flips. Do we really want to throw that property away?

Sure, for an even number bitflips beyond a small number, we're left
with the luck ordinary collisions, and CRC is not particularly great,
but for two messages of the same length, I'm also not sure it's all
that bad, either

--
John Naylor
Amazon Web Services



pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: Fix early elog(FATAL)
Next
From: Alvaro Herrera
Date:
Subject: Re: FileFallocate misbehaving on XFS