Re: Block-level CRC checks - Mailing list pgsql-hackers

From Decibel!
Subject Re: Block-level CRC checks
Date
Msg-id B10B0AAB-E437-4371-9DD5-6B028D0458A1@decibel.org
In response to Re: Block-level CRC checks  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: Block-level CRC checks  (Greg Stark <greg.stark@enterprisedb.com>)
Re: Block-level CRC checks  (Robert Treat <xzilla@users.sourceforge.net>)
List pgsql-hackers
On Sep 30, 2008, at 1:48 PM, Heikki Linnakangas wrote:
> This has been suggested before, and the usual objection is  
> precisely that it only protects from errors in the storage layer,  
> giving a false sense of security.

If you can come up with a mechanism for detecting non-storage errors  
as well, I'm all ears. :)

In the meantime, you're way, way more likely to experience corruption  
at the storage layer than anywhere else. We've had several corruption  
events, only one of which was memory related... and we *know* it was  
memory related because we actually got logs saying so. But with a SAN  
environment there's a lot of moving parts, all waiting to screw up  
your data:

filesystem
SAN device driver
SAN network
SAN BIOS
drive BIOS
drive

That's on top of the things that could hose your data outside of storage:
kernel
CPU
memory
motherboard

> Don't some filesystems include a per-block CRC, which would  
> achieve the same thing? ZFS?


Sure, some do. We're on Linux and can't run ZFS. And I'll argue that  
no Linux FS is anywhere near as tested as ext3 is, which means that  
going to some other FS that offers you CRC means you're now exposing  
yourself to the possibility of issues with the FS itself. Not to  
mention that changing filesystems on a large production system is  
very painful.
-- 
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828


