Re: Block-level CRC checks - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Block-level CRC checks
Date
Msg-id 200912012300.nB1N0ek11277@momjian.us
Whole thread Raw
In response to Re: Block-level CRC checks  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Block-level CRC checks  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Greg Stark wrote:
> >> It should be relatively cheap to skip the hint bits in the line
> >> pointers since they'll be the same bits of every 16-bit value for a
> >> whole range. Alternatively we could just CRC the tuples and assume a
> >> corrupted line pointer will show itself quickly. That would actually
> >> make it faster than a straight CRC of the whole block -- making
> >> lemonade out of lemons as it were.
> 
> I don't think "relatively cheap" is the right criterion here --- the
> question to me is how many assumptions are you making in order to
> compute the page's CRC.  Each assumption degrades the reliability
> of the check, not to mention creating another maintenance hazard.
> 
> > Yea, I am thinking we would have to have the hint bits in the line
> > pointers --- if not, we would have to reserve a lot of free space to
> > hold the maximum number of tuple hint bits --- seems like a waste.
> 
> Not if you're willing to move the line pointers around.  I'd envision
> an extra pointer in the page header, with a layout along the lines of
> 
>     fixed-size page header
>     hint bits
>     line pointers
>     free space
>     tuples proper
>     special space
> 
> with the CRC covering everything except the hint bits and perhaps the
> free space (depending on whether you wanted to depend on two more
> pointers to be right).  We would have to move the line pointers anytime
> we needed to grow the hint-bit space, and there would be a
> straightforward tradeoff between how often to move the pointers versus
> how much potentially-wasted space we leave at the end of the hint area.

I assume you don't want the hint bits in the line pointers because we
would need to lock the page?

> Or we could put the hint bits after the pointers, which might be better
> because the hints would be smaller == cheaper to move.

I don't see the value there because you would need to move the hint bits
every time you added a new line pointer.  The bigger problem is that you
would need to lock the page to update the hint bits if they move around
on the page.

> > I also like the idea that we don't need to CRC check the line pointers
> > because any corruption there is going to appear immediately.  However,
> > the bad news is that we wouldn't find the corruption until we try to
> > access bad data and might crash.
> 
> That sounds exactly like the corruption detection system we have now.
> If you think that behavior is acceptable, we can skip this whole
> discussion.

Agreed, hence the "bad" part.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Block-level CRC checks
Next
From: Bruce Momjian
Date:
Subject: Re: enable-thread-safety defaults?