Re: Block-level CRC checks - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Block-level CRC checks
Msg-id 1259615939.13774.9735.camel@ebony
In response to Re: Block-level CRC checks  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: Block-level CRC checks
List pgsql-hackers
On Mon, 2009-11-30 at 22:27 +0200, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > Proposal
> > 
> > * We reserve enough space on a disk block for a CRC check. When a dirty
> > block is written to disk we calculate and annotate the CRC value, though
> > this is *not* WAL logged.
> 
> Imagine this:
> 1. A hint bit is set. It is not WAL-logged, but the page is dirtied.
> 2. The buffer is flushed out of the buffer cache to the OS. A new CRC is
> calculated and stored on the page.
> 3. Half of the page is flushed to disk (aka torn page problem). The CRC
> made it to disk but the flipped hint bit didn't.
> 
> You now have a page with incorrect CRC on disk.

You've written that as if you are spotting a problem. It sounds to me
like this is exactly the situation we would like to detect, and this is
a perfect way of doing that.
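
To make the three steps in the quoted scenario concrete, here is a rough
sketch of a torn write tripping a page CRC. Everything about the layout
is an assumption for illustration only: an 8 kB page, a 4-byte CRC kept
in the first bytes of the page, a hint bit at an arbitrary offset in the
second half, and zlib's CRC-32 standing in for whatever checksum would
really be used. None of this is taken from the actual page format.

```python
import zlib

PAGE_SIZE = 8192     # assumed page size
CRC_SIZE = 4         # hypothetical: CRC lives in the first 4 bytes
HINT_OFFSET = 6000   # hypothetical: a hint bit in the second half


def page_crc(page):
    """CRC-32 over everything on the page except the CRC field."""
    return zlib.crc32(bytes(page[CRC_SIZE:])) & 0xFFFFFFFF


def stamp_crc(page):
    """Recalculate the CRC and store it in the page header."""
    page[:CRC_SIZE] = page_crc(page).to_bytes(CRC_SIZE, "little")


def crc_ok(page):
    """True if the stored CRC matches the page contents."""
    stored = int.from_bytes(page[:CRC_SIZE], "little")
    return stored == page_crc(page)


# A clean page that is already safely on disk.
on_disk = bytearray(PAGE_SIZE)
stamp_crc(on_disk)

# 1. A hint bit is set in the buffered copy (no WAL record).
in_memory = bytearray(on_disk)
in_memory[HINT_OFFSET] |= 0x01

# 2. The buffer is flushed: a new CRC is calculated and stored.
stamp_crc(in_memory)

# 3. Torn write: only the first half (with the new CRC) reaches disk,
#    the second half (with the hint bit) still holds the old contents.
half = PAGE_SIZE // 2
torn = bytearray(in_memory[:half] + on_disk[half:])

print(crc_ok(on_disk), crc_ok(in_memory), crc_ok(torn))
```

In this sketch the old page and the fully written page both verify,
while the torn page does not. The check says only that the page is
inconsistent; it does not say which half is stale.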

What do you see as the purpose here, apart from spotting corruption?

Do we think error rates are so low that we can recover from the
corruption by doing something clever with the CRC? I envisage most
corruptions as being unrecoverable except from backup/WAL/replicated
servers.

It's been a long day, so perhaps I've misunderstood.

-- Simon Riggs           www.2ndQuadrant.com


