Re: Page Checksums - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: Page Checksums
Msg-id 4EF073400200002500043E80@gw.wicourts.gov
In response to Re: Page Checksums  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Dec 19, 2011 at 2:44 PM, Kevin Grittner
> <Kevin.Grittner@wicourts.gov> wrote:
>> I was thinking that we would warn when such was found, set hint
>> bits as needed, and rewrite with the new CRC.  In the unlikely
>> event that it was a torn hint-bit-only page update, it would be a
>> warning about something which is a benign side-effect of the OS
>> or hardware crash.
> 
> But that's terrible.  Surely you don't want to tell people:
> 
> WARNING:  Your database is corrupted, or maybe not.  But don't
> worry, I modified the data block so that you won't get this
> warning again.
> 
> OK, I guess I'm not sure that you don't want to tell people that. 
> But *I* don't!

Well, I would certainly change that to comply with standard message
style guidelines.  ;-)

But the alternatives I've heard so far bother me more.  It sounds
like the most-often suggested alternative is:

ERROR (or stronger?):  page checksum failed in relation 999 page 9
DETAIL:  This may not actually affect the validity of any tuples,
since it could be a flipped bit in the checksum itself or dead
space, but we're shutting you down just in case.
HINT:  You won't be able to read anything on this page, even if it
appears to be well-formed, without stopping your database and using
some arcane tool you've never heard of before to examine and
hand-modify the page.  Any query which accesses this table may fail
in the same way.

The warning-level message would be followed by something more severe
if the page or a needed tuple is mangled badly enough that it cannot
be used.  I guess the biggest risk here is that there is real damage
to data which doesn't generate a stronger response, and the users
are ignoring warning messages.  I'm not sure what to do about that,
but the above error doesn't seem like the right solution.
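
To make the two behaviors under discussion concrete, here is a
minimal sketch in Python.  Everything in it is an assumption made for
illustration, not PostgreSQL code: the page layout (a 4-byte CRC-32
at the start of the page), the names `stamp`, `read_page` and
`PageChecksumError`, and the `policy` argument are all hypothetical.
The "warn" policy reports the mismatch and restamps the page; the
"error" policy refuses to return the page.

```python
import warnings
import zlib

PAGE_SIZE = 8192   # assumed page size for this sketch
CRC_BYTES = 4      # hypothetical layout: checksum in the first 4 bytes


class PageChecksumError(Exception):
    """Raised under the strict policy when the stored checksum mismatches."""


def compute_crc(page: bytes) -> int:
    # The checksum covers everything after the checksum field itself.
    return zlib.crc32(page[CRC_BYTES:]) & 0xFFFFFFFF


def stamp(page: bytes) -> bytes:
    # Return a copy of the page with a fresh checksum in the header field.
    return compute_crc(page).to_bytes(CRC_BYTES, "little") + page[CRC_BYTES:]


def read_page(page: bytes, policy: str = "warn") -> bytes:
    stored = int.from_bytes(page[:CRC_BYTES], "little")
    if stored == compute_crc(page):
        return page
    if policy == "error":
        # Strict alternative: the page cannot be read at all.
        raise PageChecksumError("page checksum failed")
    # Warn-and-rewrite: report once, then restamp so the warning
    # is not repeated on the next read of the same page.
    warnings.warn("page checksum failed; rewriting page with new checksum")
    return stamp(page)
```

Note that the sketch cannot tell a torn hint-bit-only write from real
damage; under "warn" both are restamped, which is exactly the risk
described above.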

Assuming we do something about the "torn page on hint-bit only
write" issue, by moving the hint bits to somewhere else or logging
their writes, what would you suggest is the right thing to do when a
page is read with a checksum which doesn't match page contents?

-Kevin

