Re: Block-level CRC checks - Mailing list pgsql-hackers

From Joshua D. Drake
Subject Re: Block-level CRC checks
Date
Msg-id 1259690527.26322.30.camel@jd-desktop.iso-8859-1.charter.com
In response to Re: Block-level CRC checks  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, 2009-12-01 at 10:55 -0500, Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > On Tue, 2009-12-01 at 16:40 +0200, Heikki Linnakangas wrote:
> >> It's not hard to imagine that when a hardware glitch happens
> >> causing corruption, it also causes the system to crash. Recalculating
> >> the CRCs after crash would mask the corruption.
> 
> > They are already masked from us, so continuing to mask those errors
> > would not put us in a worse position.
> 
> No, it would just destroy a large part of the argument for why this
> is worth doing.  "We detect disk errors ... except for ones that happen
> during a database crash."  "Say what?"
> 
> The fundamental problem with this is the same as it's been all along:
> the tradeoff between implementation work expended, performance overhead
> added, and net number of real problems detected (with a suitably large
> demerit for actually *introducing* problems) just doesn't look
> attractive.  You can make various compromises that improve one or two of
> these factors at the cost of making the others worse, but at the end of
> the day I've still not seen a combination that seems worth doing.

Let me try a different but similar perspective. The problem we are
trying to solve here matters only to a very small subset of the people
actually using PostgreSQL: specifically, those running PostgreSQL in
situations where an outage can cost them many thousands of dollars
per minute or hour.

On the other hand, it is those very people who are *paying* people to
try to implement these features. Kind of a catch-22.

The hard-core reality is this: *IF* it is one of the goals of this
project to ensure that the software can be safely, effectively, and
responsibly operated in a manner that is acceptable to C*-level people
in a Fortune-level company, then we *must* solve this problem.

If it is not a goal of the project, leave it to EDB/CMD/2ndQuadrant
to fork it, because that will eventually happen. Our customers are
demanding these features.

Sincerely,

Joshua D. Drake


-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
If the world pushes look it in the eye and GRR. Then push back harder. - Salamander


