Re: [HACKERS] Checksums by default? - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: [HACKERS] Checksums by default?
Date
Msg-id CAEepm=2eewm1fR0j=i9vyoxXJj7Nv_+XJye=TBQ5iMroB+YPnw@mail.gmail.com
In response to Re: [HACKERS] Checksums by default?  (Petr Jelinek <petr.jelinek@2ndquadrant.com>)
Responses Re: [HACKERS] Checksums by default?
List pgsql-hackers
On Sun, Jan 22, 2017 at 7:37 AM, Stephen Frost <sfrost@snowman.net> wrote:
> Exactly, and that awareness will allow a user to prevent further data
> loss or corruption.  Slow corruption over time is a very much known and
> accepted real-world case that people do experience, as well as bit
> flipping enough for someone to write a not-that-old blog post about
> them:
>
> https://blogs.oracle.com/ksplice/entry/attack_of_the_cosmic_rays1

I have no doubt that low-frequency cosmic ray bit flipping in main
memory is a real phenomenon, having worked at a company that runs
enough computers to see ECC messages in kernel logs on a regular
basis.  But our checksums can't actually help with that, can they?  We
verify checksums on the way into shared buffers, and compute new
checksums on the way back to disk, so any bit-flipping that happens in
between those two times -- while your data is a sitting duck in shared
buffers -- would not be detected by this scheme.  That's ECC's job.
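
To make that window concrete, here is a toy sketch of the verify-on-read,
recompute-on-write cycle described above.  It is not PostgreSQL code: the
4-byte checksum prefix, the use of zlib.crc32 (PostgreSQL's page checksum
is a different algorithm), and the helper names write_page, read_page and
flip_bit are all assumptions made up for illustration.

```python
import zlib

PAGE_SIZE = 8192  # assumed page size, for illustration only


def checksum(payload: bytes) -> int:
    # Stand-in checksum; NOT the algorithm PostgreSQL actually uses.
    return zlib.crc32(payload) & 0xFFFFFFFF


def write_page(payload: bytes) -> bytes:
    """On the way to disk: compute a fresh checksum over whatever is in memory."""
    return checksum(payload).to_bytes(4, "big") + payload


def read_page(stored: bytes) -> bytes:
    """On the way into the buffer: verify the stored checksum."""
    expected = int.from_bytes(stored[:4], "big")
    payload = stored[4:]
    if checksum(payload) != expected:
        raise ValueError("checksum mismatch")
    return payload


def flip_bit(data: bytes, bit: int) -> bytes:
    """Return a copy of data with one bit inverted."""
    buf = bytearray(data)
    buf[bit // 8] ^= 1 << (bit % 8)
    return bytes(buf)


original = bytes(PAGE_SIZE)
on_disk = write_page(original)

# Case 1: the bit flips while the page sits in memory.
in_memory = flip_bit(read_page(on_disk), 12345)
on_disk_again = write_page(in_memory)  # new checksum blesses the bad data
memory_flip_detected = False
try:
    read_page(on_disk_again)
except ValueError:
    memory_flip_detected = True

# Case 2: the bit flips in storage, after the checksum was computed.
damaged = flip_bit(on_disk, 4 * 8 + 12345)  # skip the 4-byte checksum prefix
storage_flip_detected = False
try:
    read_page(damaged)
except ValueError:
    storage_flip_detected = True

print("flip in memory detected: ", memory_flip_detected)
print("flip in storage detected:", storage_flip_detected)
```

In this sketch the in-memory flip is written back with a freshly computed,
perfectly valid checksum, so the next read accepts the damaged page, while
the flip that happens after the checksum was computed is caught on read.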

So the risk being defended against is corruption while the data is in
the disk subsystem, whatever that might consist of (and certainly that
includes more buffers in strange places that are themselves susceptible
to memory faults etc., and hopefully they have their own error detection
and correction).  Certainly the ZFS community thinks that pile of
turtles can't be trusted and that extra checks are worthwhile, and you
can find anecdotal reports and studies about filesystem corruption
being detected, for example in the links from
https://en.wikipedia.org/wiki/ZFS#Data_integrity .

So +1 for enabling checksums by default.  I always turn them on.

-- 
Thomas Munro
http://www.enterprisedb.com
