Re: Enabling Checksums - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: Enabling Checksums
Date
Msg-id 51341A83.6030709@2ndquadrant.com
In response to Re: Enabling Checksums  (Greg Smith <greg@2ndQuadrant.com>)
Responses Re: Enabling Checksums  (Greg Smith <greg@2ndQuadrant.com>)
Re: Enabling Checksums  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
On 03/04/2013 11:18 AM, Greg Smith wrote:
> On 3/3/13 9:22 AM, Craig Ringer wrote:
>> Did you get a chance to see whether you can run it in
>> checksum-validation-and-update-off backward compatible mode? This seems
>> like an important thing to have working (and tested for) in case of
>> bugs, performance issues or other unforeseen circumstances.
>
> There isn't any way to do this in the current code.  The big
> simplification Jeff introduced here, to narrow complexity toward a
> commit candidate, was to make checksumming a cluster-level decision.
> You get it for everything or not at all.
>
> The problem I posted about earlier today, where a header checksum
> error can block access to the entire relation, could be resolved with
> some sort of "ignore read checksums" GUC.  But that's impractical
> right now for the write side of things.  There has been a long list
> of metadata proposals to handle situations where part of a cluster is
> checksummed, but not all of it.  Once that sort of feature is
> implemented, it becomes a lot easier to talk about selectively
> disabling writes.
>
> As for a design of a GUC that might be useful here, the option itself
> strikes me as being like archive_mode in its general use.  There is an
> element of parameters like wal_sync_method or enable_cassert though,
> where the options available vary depending on how you built the
> cluster.  Maybe name it checksum_level with options like this:
>
> off:  only valid option if you didn't enable checksums with initdb
> enforcing:  full checksum behavior as written right now.
> unvalidated:  broken checksums on reads are ignored.
>
> The main tricky case I see in that is where you read in a page with a
> busted checksum using "unvalidated".  Ideally you wouldn't write such
> a page back out again, because it's going to hide that it's corrupted
> in some way already.  How to enforce that though?  Perhaps
> "unvalidated" should only be allowed in a read-only transaction?

That sounds like a really good step for disaster recovery, yes.

I also suspect that at least in the first release it might be desirable
to have an option that essentially says "something's gone horribly wrong
and we no longer want to check or write checksums, we want a
non-checksummed DB that can still read our data from before we turned
checksumming off". Essentially, a way for someone who's trying
checksumming in production after their staging tests worked out OK to
abort and go back to the non-checksummed case without having to do a
full dump and reload.

Given that, I suspect we need a 4th state, like "disabled" or
"unvalidating_writable" where we ignore checksums completely and
maintain the checksum-enabled layout but just write padding to the
checksum fields and don't bother to check them on reading.
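
To make the four states concrete, here is a rough sketch in C of how they might map onto read and write behaviour. Every name in it (ChecksumLevel, the CHECKSUM_* values, the helper functions) is made up for illustration and none of it comes from the patch under discussion. It also assumes two things that are only suggestions so far: that "unvalidated" is restricted to read-only use, and that "off" is not selectable once a cluster has been initdb'd with checksums.

```c
#include <stdbool.h>

/* Hypothetical setting; the names are illustrative, not from the patch. */
typedef enum ChecksumLevel
{
    CHECKSUM_OFF,          /* cluster was initdb'd without checksums */
    CHECKSUM_ENFORCING,    /* verify on read, compute on write */
    CHECKSUM_UNVALIDATED,  /* ignore bad checksums on read */
    CHECKSUM_DISABLED      /* keep checksummed layout, write padding, never verify */
} ChecksumLevel;

/* Only "enforcing" rejects a page whose checksum does not match. */
static bool
checksum_verify_on_read(ChecksumLevel level)
{
    return level == CHECKSUM_ENFORCING;
}

/* Only "enforcing" computes a real checksum; "disabled" writes padding. */
static bool
checksum_compute_on_write(ChecksumLevel level)
{
    return level == CHECKSUM_ENFORCING;
}

/*
 * Assumption: "unvalidated" is read-only, so a page that was read with a
 * bad checksum can never be written back and hide the corruption.
 */
static bool
checksum_writes_allowed(ChecksumLevel level)
{
    return level != CHECKSUM_UNVALIDATED;
}

/*
 * Assumption: without initdb checksums only "off" is valid, and with them
 * "off" is not, because the on-disk layout still carries the checksum field.
 */
static bool
checksum_level_is_valid(ChecksumLevel level, bool initdb_checksums)
{
    if (!initdb_checksums)
        return level == CHECKSUM_OFF;
    return level != CHECKSUM_OFF;
}
```

The point of the fourth state is visible in the helpers: "disabled" permits writes without computing or verifying anything, so a checksummed cluster could keep running without a dump and reload.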

My key concern boils down to being able to get someone up and running
quickly and with minimal disruption if something we didn't think of goes
wrong. "Oh, you have to dump and reload your 1TB database before you can
start writing to it again" isn't going to cut it.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



