Re: Detecting corrupted pages earlier - Mailing list pgsql-hackers

From Sailesh Krishnamurthy
Subject Re: Detecting corrupted pages earlier
Date
Msg-id bxyadgu4rj8.fsf@datafix.CS.Berkeley.EDU
In response to Detecting corrupted pages earlier  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
    Tom> Postgres has a bad habit of becoming very confused if the
    Tom> page header of a page on disk has become corrupted. In
    Tom> particular, bogus values in the pd_lower field tend to make

I haven't read this piece of pgsql code very carefully, so I apologize
if what I suggest is already present.

One "standard" solution to handle disk page corruption is the use of
"consistency" bits.

The idea is that the first bit of every 256-byte chunk of a page is a
consistency bit (c-bit). In an 8K page, you'd have 32 consistency bits.
If the page is in a "consistent" state, then all 32 bits will be the
same: either all 0 or all 1. When a page is written to disk, the
"actual" bit in each c-bit position is copied out and placed in the
header (end/beginning) of the page. With an 8K page, one 32-bit word
holds all the "actual" bits. Then the c-bits are all either set or
reset depending on their state when the page was last read: if at read
time the c-bits were set, then at write time they are reset, and vice
versa. So when you read a page, if some of the consistency bits are set
and some others are reset, then you know that there was a corruption.
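
To make the scheme above concrete, here is a rough sketch in Python (not
pgsql code). The names are all made up for illustration, and two details
are my own assumptions: the c-bit is taken to be the top bit of the first
byte of each 256-byte chunk, and the word of "actual" bits lives at a
hypothetical offset 4 in the page, which is treated as reserved space.

```python
PAGE_SIZE = 8192                  # 8K page, as in the text
CHUNK = 256                       # assumed atomic write unit
NUM_CBITS = PAGE_SIZE // CHUNK    # 32 c-bits per page
CBIT = 0x80                       # assumption: top bit of each chunk's first byte
SAVED_AT = 4                      # hypothetical offset of the saved 32-bit word


class PageCorruptError(Exception):
    """Raised when the c-bits of a page read from disk disagree."""


def prepare_for_write(page, last_read_state):
    """Return the on-disk image of `page`.

    `last_read_state` is the c-bit value seen when the page was last read
    (False for a page that has never been read). Bytes SAVED_AT..SAVED_AT+3
    of `page` are assumed to be reserved and are overwritten.
    """
    out = bytearray(page)
    # Copy the "actual" bit at each c-bit position into one 32-bit word.
    saved = 0
    for i in range(NUM_CBITS):
        if out[i * CHUNK] & CBIT:
            saved |= 1 << i
    out[SAVED_AT:SAVED_AT + 4] = saved.to_bytes(4, "little")
    # Flip the state: set on last read means reset on write, and vice versa.
    new_state = not last_read_state
    for i in range(NUM_CBITS):
        if new_state:
            out[i * CHUNK] |= CBIT
        else:
            out[i * CHUNK] &= 0x7F
    return bytes(out)


def check_and_restore(raw):
    """Verify the c-bits of an on-disk image and restore the actual bits.

    Returns (page, state), where `state` is the c-bit value found on disk.
    """
    bits = [bool(raw[i * CHUNK] & CBIT) for i in range(NUM_CBITS)]
    if any(bits) and not all(bits):
        raise PageCorruptError("mixed consistency bits")
    page = bytearray(raw)
    saved = int.from_bytes(raw[SAVED_AT:SAVED_AT + 4], "little")
    for i in range(NUM_CBITS):
        if (saved >> i) & 1:
            page[i * CHUNK] |= CBIT
        else:
            page[i * CHUNK] &= 0x7F
    page[SAVED_AT:SAVED_AT + 4] = bytes(4)   # clear the reserved word again
    return page, bits[0]
```

The caller has to remember the state returned by check_and_restore and
pass it to the next prepare_for_write. A write that only partly reaches
the disk then leaves some chunks with the old c-bit value and some with
the new one, and the next read fails the check.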

This is of course based on the assumption that most disk arms manage
to atomically write 256 bytes at a time. 

-- 
Pip-pip
Sailesh
http://www.cs.berkeley.edu/~sailesh

