Re: corrupt pages detected by enabling checksums - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: corrupt pages detected by enabling checksums
Date
Msg-id 1365205177.7580.3183.camel@sussancws0025
Whole thread Raw
In response to Re: corrupt pages detected by enabling checksums  (Florian Pflug <fgp@phlo.org>)
Responses Re: corrupt pages detected by enabling checksums
List pgsql-hackers
On Fri, 2013-04-05 at 10:34 +0200, Florian Pflug wrote:
> Maybe we could scan forward to check whether a corrupted WAL record is
> followed by one or more valid ones with sensible LSNs. If it is,
> chances are high that we haven't actually hit the end of the WAL. In
> that case, we could either log a warning, or (better, probably) abort
> crash recovery.

+1.

> Corruption of fields which we require to scan past the record would
> cause false negatives, i.e. no trigger an error even though we do
> abort recovery mid-way through. There's a risk of false positives too,
> but they require quite specific orderings of writes and thus seem
> rather unlikely. (AFAICS, the OS would have to write some parts of
> record N followed by the whole of record N+1 and then crash to cause a
> false positive).

Does the xlp_pageaddr help solve this?

Also, we'd need to be a little careful when written-but-not-flushed WAL
data makes it to disk, which could cause a false positive and may be a
fairly common case.

Regards,Jeff Davis





pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: Re: corrupt pages detected by enabling checksums
Next
From: Noah Misch
Date:
Subject: Re: Drastic performance loss in assert-enabled build in HEAD