Re: Enabling Checksums - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Enabling Checksums
Date
Msg-id 513729BD.1010501@vmware.com
In response to Re: Enabling Checksums  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: Enabling Checksums  (Andres Freund <andres@2ndquadrant.com>)
Re: Enabling Checksums  (Garick Hamlin <ghamlin@isc.upenn.edu>)
Re: Enabling Checksums  (Craig Ringer <craig@2ndquadrant.com>)
Re: Enabling Checksums  (Greg Smith <greg@2ndQuadrant.com>)
Re: Enabling Checksums  (Ants Aasma <ants@cybertec.at>)
List pgsql-hackers
On 06.03.2013 10:41, Simon Riggs wrote:
> On 5 March 2013 18:02, Jeff Davis <pgsql@j-davis.com> wrote:
>
>> Fletcher is probably significantly faster than CRC-16, because I'm just
>> doing int32 addition in a tight loop.
>>
>> Simon originally chose Fletcher, so perhaps he has more to say.
>
> IIRC the research showed Fletcher was significantly faster for only a
> small loss in error detection rate.
>
> It was sufficient to make our error detection > 1 million times
> better, possibly more. That seems sufficient to enable early detection
> of problems, since if we missed the first error, a second is very
> likely to be caught (etc). So I am assuming that we're trying to catch
> a pattern of errors early, rather than guarantee we can catch the very
> first error.

Fletcher's checksum is good in general; I was mainly worried about
truncating the Fletcher-64 into two 8-bit values. I can't spot any
obvious weakness in that, but if it's indeed faster than, and as good as,
a straightforward Fletcher-16, I wonder why that method is not more
widely used.
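
To make the comparison concrete, here is a minimal sketch that sets a textbook Fletcher-16 next to Fletcher-style sums over 32-bit words folded down to two 8-bit values. The function names and the modulo-255 fold are assumptions made for illustration; the exact reduction used in the patch is not shown in this thread.

```python
import struct

MASK64 = (1 << 64) - 1  # emulate uint64 accumulators


def fletcher16(data: bytes) -> int:
    """Textbook Fletcher-16: two running sums over bytes, each modulo 255."""
    sum1 = sum2 = 0
    for byte in data:
        sum1 = (sum1 + byte) % 255
        sum2 = (sum2 + sum1) % 255
    return (sum2 << 8) | sum1


def wide_fletcher_folded16(data: bytes) -> int:
    """Fletcher-style sums over 32-bit words, folded to two 8-bit values.

    Hypothetical: the modulo-255 fold is an assumption of this sketch,
    not necessarily the reduction used in the patch under discussion.
    """
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4 bytes")
    sum1 = sum2 = 0
    for (word,) in struct.iter_unpack("<I", data):
        sum1 = (sum1 + word) & MASK64   # plain addition, no per-step modulo
        sum2 = (sum2 + sum1) & MASK64
    # Fold each 64-bit accumulator to 8 bits only once, at the end.
    return ((sum2 % 255) << 8) | (sum1 % 255)


if __name__ == "__main__":
    page = bytes(range(256)) * 32       # 8192-byte stand-in for a page
    print(hex(fletcher16(page)), hex(wide_fletcher_folded16(page)))
```

Under this assumed fold, and as long as the accumulators do not wrap, the first sum reduces to the same value as Fletcher-16's byte sum, because 2^8 is congruent to 1 modulo 255. Only the second, position-weighted sum differs, since it weights whole 32-bit words rather than individual bytes.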

Another thought is that perhaps something like CRC32C would be faster to
calculate on modern hardware, and could be safely truncated to 16 bits
using the same technique you're using to truncate the Fletcher
checksum. Greg's tests showed that the overhead of CRC calculation is
significant in some workloads, so it would be good to spend some time
optimizing that. It'd be difficult to change the algorithm in a future
release without breaking on-disk compatibility, so let's make sure we
pick the best one.
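
As a rough illustration of the CRC32C idea, the sketch below computes CRC32C (Castagnoli polynomial, reflected form 0x82F63B78) with a lookup table and then reduces it to 16 bits. The XOR fold in `fold16` is a hypothetical choice; the thread does not specify the truncation technique. The code shows only the arithmetic and says nothing about speed, which on real hardware would depend on a dedicated CRC instruction rather than a table loop.

```python
POLY = 0x82F63B78  # CRC32C (Castagnoli) polynomial, bit-reflected


def _make_table() -> list:
    """Precompute the CRC of every possible byte value."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


TABLE = _make_table()


def crc32c(data: bytes) -> int:
    """Table-driven CRC32C: initial value and final XOR are 0xFFFFFFFF."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def fold16(crc: int) -> int:
    """Hypothetical truncation: XOR the high and low 16-bit halves."""
    return ((crc >> 16) ^ crc) & 0xFFFF


def crc32c_16(data: bytes) -> int:
    """CRC32C of the data, reduced to a 16-bit value with fold16."""
    return fold16(crc32c(data))


if __name__ == "__main__":
    page = bytes(range(256)) * 32       # 8192-byte stand-in for a page
    print(hex(crc32c(page)), hex(crc32c_16(page)))
```

The standard check value for CRC32C is `crc32c(b"123456789") == 0xE3069283`, which is a quick way to confirm the table is built correctly before looking at the folded result.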

- Heikki


