Re: corrupt pages detected by enabling checksums - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: corrupt pages detected by enabling checksums
Msg-id 1368137927.24407.85.camel@jdavis
In response to Re: corrupt pages detected by enabling checksums  (Jim Nasby <jim@nasby.net>)
Responses Re: corrupt pages detected by enabling checksums
List pgsql-hackers
On Thu, 2013-05-09 at 14:28 -0500, Jim Nasby wrote:
> What about moving some critical data from the beginning of the WAL
> record to the end? That would make it easier to detect that we don't
> have a complete record. It wouldn't necessarily replace the CRC
> though, so maybe that's not good enough.
> 
> Actually, what if we actually *duplicated* some of the same WAL header
> info at the end of the record? Given a reasonable amount of data that
> would damn-near ensure that a torn record was detected, because the
> odds of having the exact same sequence of random bytes would be so
> low. Potentially even just duplicating the LSN would suffice.

I think both of these ideas have some false positives and false
negatives.

If the corruption happens at the record boundary, and wipes out the
special information at the end of the record, then you might think it
was not fully flushed, and we're in the same position as today.

If the WAL record is large, and somehow the beginning and the end get
written to disk but not the middle, then it will look like corruption;
but really the WAL was just not completely flushed. This seems pretty
unlikely, but not impossible.
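
To make the two failure modes above concrete, here is a rough Python sketch of a record that repeats its LSN after the payload. It is only a toy model: the layout, `encode_record`, and `classify` are made up for illustration and are not PostgreSQL's actual WAL record format or reader.

```python
import struct
import zlib

# Hypothetical layout, NOT the real XLogRecord:
#   header  = LSN (8 bytes) + payload length (4 bytes)
#   trailer = CRC-32 of header+payload (4 bytes) + duplicated LSN (8 bytes)
HEADER = struct.Struct("<QI")
TRAILER = struct.Struct("<IQ")


def encode_record(lsn: int, payload: bytes) -> bytes:
    """Build one toy record with the LSN repeated at the end."""
    head = HEADER.pack(lsn, len(payload))
    crc = zlib.crc32(head + payload)
    return head + payload + TRAILER.pack(crc, lsn)


def classify(buf: bytes) -> str:
    """Guess the state of a record: 'complete', 'incomplete' or 'corrupt'."""
    if len(buf) < HEADER.size:
        return "incomplete"
    lsn, length = HEADER.unpack_from(buf, 0)
    end = HEADER.size + length
    if len(buf) < end + TRAILER.size:
        return "incomplete"
    crc, lsn_copy = TRAILER.unpack_from(buf, end)
    if lsn_copy != lsn:
        # The trailing copy is missing or wrong, so the reader assumes
        # the tail of the record never reached disk.
        return "incomplete"
    if zlib.crc32(buf[:end]) != crc:
        # Both ends look written but the contents do not check out.
        return "corrupt"
    return "complete"


record = encode_record(0x16B3F80, b"x" * 8192)

# Case 1: real corruption that happens to wipe out the trailer.
trailer_wiped = record[:-TRAILER.size] + bytes(TRAILER.size)

# Case 2: the first and last 512 bytes hit the disk, the middle did not.
middle_missing = record[:512] + bytes(len(record) - 1024) + record[-512:]

print(classify(record))          # complete
print(classify(trailer_wiped))   # incomplete
print(classify(middle_missing))  # corrupt
```

In this model, case 1 is corruption that gets reported as an unfinished flush, and case 2 is an unfinished flush that gets reported as corruption, so the duplicated LSN narrows the ambiguity without removing it.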

That being said, I like the idea of introducing some extra checks if a
perfect solution is not possible.

> On the separate write idea, if that could be controlled by a GUC I
> think it'd be worth doing. Anyone that needs to worry about this
> corner case probably has hardware that would support that.

It sounds pretty easy to do that naively. I'm just worried that the
performance will be so bad for so many users that it's not a very
reasonable choice.

Today, it would probably make more sense to just use sync rep. If the
master's WAL is corrupt, and it starts up too early, then that should be
obvious when you try to reconnect streaming replication. I haven't tried
it, but I'm assuming that it gives a useful error message.

Regards,
	Jeff Davis




