Re: Page Checksums - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Page Checksums
Msg-id CA+U5nMJVt8EXxgAtBVptYYuu0AbOtVcUOfCtgcKAQ9aLnrCH1A@mail.gmail.com
In response to Re: Page Checksums  (Josh Berkus <josh@agliodbs.com>)
Responses Re: Page Checksums  (Andres Freund <andres@anarazel.de>)
Re: Page Checksums  (Robert Haas <robertmhaas@gmail.com>)
Re: Page Checksums  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On Mon, Dec 19, 2011 at 4:21 AM, Josh Berkus <josh@agliodbs.com> wrote:
> On 12/18/11 5:55 PM, Greg Stark wrote:
>> There is another way to look at this problem. Perhaps it's worth
>> having a checksum *even if* there are ways for the checksum to be
>> spuriously wrong. Obviously having an invalid checksum can't be a
>> fatal error then but it might still be useful information. Right now
>> people don't really know if their system can experience torn pages or
>> not and having some way of detecting them could be useful. And if you
>> have other unexplained symptoms then having checksum errors might be
>> enough evidence that the investigation should start with the hardware
>> and get the sysadmin looking at hardware logs and running memtest
>> sooner.
>
> Frankly, if I had torn pages, even if it was just hint bits missing, I
> would want that to be logged.  That's expected if you crash, but if you
> start seeing bad CRC warnings when you haven't had a crash?  That means
> you have a HW problem.
>
> As long as the CRC checks are by default warnings, then I don't see a
> problem with this; it's certainly better than what we have now.

It is an important problem, and also a big one, which is why it still exists.

Throwing WARNINGs for normal events would not help anybody; thousands
of false positives would just make Postgres appear to be less robust
than it really is. That would be a credibility disaster. VMware
already have their own distro, so if they like this patch they can use
it.

The only sensible way to handle this is to change the page format as
discussed. IMHO the only sensible way that can happen is if we also
support an online upgrade feature. I will take on the online upgrade
feature if others work on the page format issues, but none of this is
possible for 9.2, ISTM.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

