On Sun, 2013-03-03 at 22:18 -0500, Greg Smith wrote:
> As for a design of a GUC that might be useful here, the option itself
> strikes me as being like archive_mode in its general use. There is an
> element of parameters like wal_sync_method or enable_cassert though,
> where the options available vary depending on how you built the cluster.
> Maybe name it checksum_level with options like this:
>
> off: only valid option if you didn't enable checksums with initdb
> enforcing: full checksum behavior as written right now.
> unvalidated: broken checksums on reads are ignored.

I think GUCs should be orthogonal to initdb settings. If nothing else,
it's extra effort to get initdb to write the right postgresql.conf.

A single new GUC that prevents checksum failures from causing an error
seems sufficient to address the concerns you, Dan, and Craig raised.

We would still calculate the checksum and print the warning, and then
pass the page through the rest of the header checks. If the header
checks pass, then the read proceeds. If the header checks fail, and if
zero_damaged_pages is off, then it would still generate an error (as
today).

So: ignore_checksum_failures = on|off?

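To make the control flow above concrete, here is a minimal C sketch of the decision, not actual backend code. The function name read_page_outcome, the PageOutcome enum, and the boolean parameters are all hypothetical. It also assumes that, with the new GUC off, a checksum failure is handled the same way as a failed header check; that part is not spelled out above.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical outcomes of reading one page; names are illustrative only. */
typedef enum
{
    PAGE_OK,        /* page is used as read */
    PAGE_ZEROED,    /* page is replaced with zeroes (zero_damaged_pages) */
    PAGE_ERROR      /* the read raises an error */
} PageOutcome;

/*
 * Sketch of the proposed behavior. checksum_ok and header_ok stand in for
 * the results of the real verification steps.
 */
static PageOutcome
read_page_outcome(bool checksum_ok, bool header_ok,
                  bool ignore_checksum_failures, bool zero_damaged_pages)
{
    bool verified = header_ok;

    if (!checksum_ok)
    {
        /* The warning is printed regardless of the GUC setting. */
        fprintf(stderr, "WARNING: page checksum verification failed\n");

        /* Assumption: with the GUC off, this counts as a failed page. */
        if (!ignore_checksum_failures)
            verified = false;
    }

    if (verified)
        return PAGE_OK;

    /* Failed verification: error unless zero_damaged_pages is on. */
    return zero_damaged_pages ? PAGE_ZEROED : PAGE_ERROR;
}
```

In this sketch, read_page_outcome(false, true, true, false) returns PAGE_OK after printing the warning, while read_page_outcome(false, false, true, false) still returns PAGE_ERROR because the header checks failed and zero_damaged_pages is off.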
> The main tricky case I see in that is where you read in a page with a
> busted checksum using "unvalidated". Ideally you wouldn't write such a
> page back out again, because it's going to hide that it's corrupted in
> some way already. How to enforce that though? Perhaps "unvalidated"
> only be allowed in a read-only transaction?

That's a good point. But we already have zero_damaged_pages, which does
something similar. And it's supposed to be a recovery option to get the
data out rather than something to run in online mode. It will still
print the warning, so it won't completely hide the corruption.

Regards,
Jeff Davis