Re: Substituting Checksum Algorithm (was: Enabling Checksums) - Mailing list pgsql-hackers

From Florian Pflug
Subject Re: Substituting Checksum Algorithm (was: Enabling Checksums)
Date
Msg-id 85364E17-11DD-4ED8-8D9F-47D5CA9CA2EE@phlo.org
Whole thread Raw
In response to Substituting Checksum Algorithm (was: Enabling Checksums)  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
On Apr23, 2013, at 09:17 , Jeff Davis <pgsql@j-davis.com> wrote:
> I'd lean toward simplicity and closer adherence to the published version
> of the algorithm rather than detecting a few more obscure error
> patterns. It looks like the modification slows down the algorithm, too.

The pattern that plain FNV1 misses are not that obscure, unfortunately.
With plain FNV1A, the n-th bit of an input word (i.e. 32-bit block) only
affects bits n through 31 of the checksum. In particular, the most
significant bit of every 32-bit block only affects the MSB of the checksum,
making the algorithm miss any even number of flipped MSBs. More generally,
any form of data corruption that affects only the top N bits are missed
at least once out of 2^N times, since changing only those bits cannot
yield more than 2^N different checksum values.

Such corruption pattern may not be especially likely, given that we're
mainly protecting against disk corruption, not memory corruption. But
quantifying how unlikely exactly seems hard, thus providing at least some
form of protection against such errors seems prudent.

In addition, even with the shift-induced slowdown, FNV1+SHIFT still
performs similarly to hardware-accelerated CRC, reaching about 6bytes/cycle
on modern Intel CPUS. This really is plenty fast - if I'm not mistaken, it
translates to well over 10 GB/s.

So overall -1 for removing the shift.

best regards,
Florian Pflug




pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: Substituting Checksum Algorithm (was: Enabling Checksums)
Next
From: Ants Aasma
Date:
Subject: Re: Substituting Checksum Algorithm (was: Enabling Checksums)