Re: Block-level CRC checks - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Block-level CRC checks
Date
Msg-id 200912012149.nB1Ln6V12367@momjian.us
In response to Re: Block-level CRC checks  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Block-level CRC checks  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Block-level CRC checks  (Richard Huxton <dev@archonet.com>)
List pgsql-hackers
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > OK, here is another idea, maybe crazy:
> 
> > When we read in a page that has an invalid CRC, we check the page to see
> > which hint bits are _not_ set, and we try setting them to see if we can
> > get a matching CRC.  If there are no missing hint bits and the CRC
> > doesn't match, we know the page is corrupt.  If two hint bits are
> > missing, we can try setting each one, and then both of them, and see if
> > we can get a matching CRC.  If we can, the page is OK; if not, it is
> > corrupt.
> 
> > Now if 32 hint bits are missing, but could be set based on transaction
> > status, then we would need to try 2^32 possible hint bit combinations,
> > so we can't do the test and we just assume the page is valid.
> 
> A typical page is going to have something like 100 tuples, so
> potentially 2^400 combinations to try.  I don't see this being
> realistic from that standpoint.  What's much worse is that to even
> find the potentially missing hint bits, you need to make very strong
> assumptions about the validity of the rest of the page.
> 
> The suggestions that were made upthread about moving the hint bits
> could resolve the second objection, but once you do that you might
> as well just exclude them from the CRC and eliminate the guessing.
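
To make the scale of that objection concrete, here is a rough sketch of the
brute-force search from the quoted proposal. Everything in it is an assumption
for illustration: zlib.crc32 stands in for whatever page CRC would be used, the
page is a plain byte string, and find_matching_hint_bits and unset_bits (bit
offsets of candidate hint bits) are hypothetical names, not PostgreSQL code.

```python
import zlib
from itertools import combinations
from typing import Optional, Sequence, Tuple


def find_matching_hint_bits(
    page: bytes, stored_crc: int, unset_bits: Sequence[int]
) -> Optional[Tuple[int, ...]]:
    """Try every subset of the candidate hint bits (hypothetical helper).

    Returns the subset of bit offsets that, once set, makes the CRC match,
    or None if no subset does (the page would then be treated as corrupt).
    """
    for k in range(len(unset_bits) + 1):
        for subset in combinations(unset_bits, k):
            trial = bytearray(page)
            for bit in subset:
                trial[bit // 8] |= 1 << (bit % 8)  # set one candidate hint bit
            if zlib.crc32(bytes(trial)) == stored_crc:
                return subset
    return None


def search_space(candidate_bits: int) -> int:
    """Number of subsets the loop above may have to try in the worst case."""
    return 2 ** candidate_bits
```

With two candidate bits this is four CRC computations; with the roughly 400
candidate bits per page mentioned above (about 100 tuples), search_space(400)
is far beyond anything that can be enumerated. The sketch also takes the list
of candidate bit positions as given, which is exactly the part that requires
trusting the rest of the page.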

OK, crazy idea #3.  What if we had a per-page counter of the number of
hint bits set --- that way, we would only consider a CRC check failure
to be corruption if the counter matched the number of hint bits actually
set on the page.
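
A minimal sketch of how such a check might look, under several assumptions
that are not part of the proposal itself: zlib.crc32 stands in for the page
CRC, the counter and CRC are kept outside the checksummed bytes, and Page,
make_page, count_set_hints, and classify are hypothetical names.

```python
import zlib
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    data: bytes             # bytes covered by the CRC, hint bits included
    hint_bits: List[int]    # assumed-known bit offsets of the hint bits
    stored_crc: int         # CRC recorded when the counter was recorded
    stored_hint_count: int  # the per-page counter of hint bits set


def count_set_hints(data: bytes, hint_bits: List[int]) -> int:
    """Count how many of the hint-bit positions are currently set."""
    return sum((data[b // 8] >> (b % 8)) & 1 for b in hint_bits)


def make_page(data: bytes, hint_bits: List[int]) -> Page:
    """Record the CRC and the hint-bit counter together."""
    return Page(data, hint_bits, zlib.crc32(data),
                count_set_hints(data, hint_bits))


def classify(page: Page) -> str:
    if zlib.crc32(page.data) == page.stored_crc:
        return "ok"
    if count_set_hints(page.data, page.hint_bits) == page.stored_hint_count:
        # No hint bit appears to have changed, so blame corruption.
        return "corrupt"
    # Hint bits changed after the CRC was recorded; failure is inconclusive.
    return "unknown"
```

Note that the "unknown" case simply gives up on detection for that page, and
that a count can match even when different hint bits are set, so this is a
heuristic rather than a proof of corruption.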

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +

