Block-level CRC checks - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Block-level CRC checks
Msg-id 20080930180209.GC4851@alvh.no-ip.org
A customer of ours has been having trouble with corrupted data for some
time.  Of course, we've almost always blamed hardware (and we've seen
RAID controllers have their firmware upgraded, among other actions), but
the useful thing to know is when corruption has happened, and where.

So we've been tasked with adding CRCs to data files.

The idea is that these CRCs are going to be checked just after reading
a block from disk, and calculated just before writing it.  They are
just a protection against the storage layer going mad; they are not
intended to protect against faulty RAM, CPU or kernel.

This code would be run-time or compile-time configurable.  I'm not
absolutely sure which yet; the problem with run-time is what to do if
the user restarts the server with the setting flipped.  Either way, the
feature would have almost no impact on users who don't enable it.

The implementation I'm envisioning requires the use of a new relation
fork to store the per-block CRCs.  Initially I'm aiming at a CRC32 sum
for each block.  FlushBuffer would calculate the checksum and store it
in the CRC fork; ReadBuffer_common would read the page, calculate the
checksum, and compare it to the one stored in the CRC fork.
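
To make the write/read symmetry concrete, here is a minimal standalone
sketch.  It is not PostgreSQL code: the in-memory crc_fork array stands in
for the proposed CRC fork, and sketch_crc32, sketch_flush_block and
sketch_verify_block are hypothetical names invented for illustration.  The
checksum is a plain bitwise CRC-32 (reflected polynomial 0xEDB88320), which
may not be the variant an actual patch would use.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLCKSZ  8192     /* PostgreSQL's default block size */
#define SKETCH_NBLOCKS 16       /* arbitrary size for this sketch */

/* Assumption: one CRC per block, indexed by block number. */
static uint32_t crc_fork[SKETCH_NBLOCKS];

/* Bitwise CRC-32, reflected polynomial 0xEDB88320. */
static uint32_t
sketch_crc32(const unsigned char *buf, size_t len)
{
    uint32_t    crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++)
    {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Write path: compute the checksum just before the block goes out. */
static void
sketch_flush_block(uint32_t blkno, const unsigned char *page)
{
    crc_fork[blkno] = sketch_crc32(page, SKETCH_BLCKSZ);
    /* a real implementation would now hand the page to the storage layer */
}

/* Read path: recompute just after the read and compare with the stored CRC. */
static bool
sketch_verify_block(uint32_t blkno, const unsigned char *page)
{
    return crc_fork[blkno] == sketch_crc32(page, SKETCH_BLCKSZ);
}
```

A caller would run sketch_flush_block() on the way out and treat a false
return from sketch_verify_block() on the way in as "the storage layer
changed this block behind our back".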

A buffer's io_in_progress lock protects the buffer's CRC.  We read and
pin the CRC page before acquiring the lock, to avoid having two buffer
IO operations in flight.
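
The ordering can be sketched roughly as follows.  This is a toy,
single-process model and not the real buffer manager: SketchBuf, the
counters, and all function names are hypothetical, and "taking the
io_in_progress lock" is reduced to setting a flag.  The only point is that
any I/O needed to bring in the CRC page finishes before the data buffer's
I/O starts.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a shared buffer header. */
typedef struct SketchBuf
{
    int         pin_count;
    bool        io_in_progress;
} SketchBuf;

static int  ios_in_flight = 0;      /* I/Os currently started by this caller */
static int  max_ios_in_flight = 0;  /* high-water mark, for checking the claim */

static void
sketch_start_io(SketchBuf *buf)
{
    buf->io_in_progress = true;
    if (++ios_in_flight > max_ios_in_flight)
        max_ios_in_flight = ios_in_flight;
}

static void
sketch_end_io(SketchBuf *buf)
{
    buf->io_in_progress = false;
    ios_in_flight--;
}

/* Bring the CRC page into memory if needed, then pin it. */
static void
sketch_read_and_pin(SketchBuf *crc_page, bool already_cached)
{
    if (!already_cached)
    {
        sketch_start_io(crc_page);
        /* read the CRC page from the CRC fork here */
        sketch_end_io(crc_page);
    }
    crc_page->pin_count++;
}

static void
sketch_flush_with_crc(SketchBuf *data, SketchBuf *crc_page, bool crc_cached)
{
    /* Step 1: CRC page first, so its I/O (if any) is already done. */
    sketch_read_and_pin(crc_page, crc_cached);

    /* Step 2: only now start I/O on the data buffer. */
    sketch_start_io(data);
    /* compute the checksum, store it in the pinned CRC page, write the block */
    sketch_end_io(data);

    crc_page->pin_count--;
}
```

Swapping steps 1 and 2 in this model would start the CRC page read while
the data buffer's I/O is still marked in progress, which is the situation
the ordering is meant to avoid.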

I'd like to submit this for 8.4, but I want to ensure that -hackers at
large approve of this feature before starting serious coding.

Opinions?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

