Re: Block-level CRC checks - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Block-level CRC checks
Msg-id 24171.1259707676@sss.pgh.pa.us
In response to Re: Block-level CRC checks  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Block-level CRC checks  (Bruce Momjian <bruce@momjian.us>)
Re: Block-level CRC checks  (Greg Stark <gsstark@mit.edu>)
Re: Block-level CRC checks  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-hackers
Bruce Momjian <bruce@momjian.us> writes:
> Greg Stark wrote:
>> It should be relatively cheap to skip the hint bits in the line
>> pointers since they'll be the same bits of every 16-bit value for a
>> whole range. Alternatively we could just CRC the tuples and assume a
>> corrupted line pointer will show itself quickly. That would actually
>> make it faster than a straight CRC of the whole block -- making
>> lemonade out of lemons as it were.

I don't think "relatively cheap" is the right criterion here --- the
question to me is how many assumptions are you making in order to
compute the page's CRC.  Each assumption degrades the reliability
of the check, not to mention creating another maintenance hazard.

> Yea, I am thinking we would have to have the hint bits in the line
> pointers --- if not, we would have to reserve a lot of free space to
> hold the maximum number of tuple hint bits --- seems like a waste.

Not if you're willing to move the line pointers around.  I'd envision
an extra pointer in the page header, with a layout along the lines of

    fixed-size page header
    hint bits
    line pointers
    free space
    tuples proper
    special space

with the CRC covering everything except the hint bits and perhaps the
free space (depending on whether you are willing to rely on two more
pointers being right).  We would have to move the line pointers anytime
we needed to grow the hint-bit space, and there would be a
straightforward tradeoff between how often to move the pointers versus
how much potentially-wasted space we leave at the end of the hint area.

Or we could put the hint bits after the pointers, which might be better
because the hints would be smaller == cheaper to move.
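
To make the first layout concrete, here is a minimal sketch in C of a CRC that
skips the hint area and, optionally, the free space.  Everything named here is
hypothetical: SketchHeader, SKETCH_PAGE_SIZE, and page_crc() are invented for
illustration and are not PostgreSQL's page header or any existing function,
and the CRC is a plain bitwise CRC-32 rather than whatever would actually be
chosen.  Note that skipping the free space means trusting two more header
fields, which is the tradeoff mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 8192

/* Hypothetical fixed-size header; the hint area starts right after it. */
typedef struct {
    uint32_t crc;           /* stored checksum, never fed into the CRC */
    uint16_t lp_start;      /* the "extra pointer": end of hints, start of line pointers */
    uint16_t free_start;    /* end of line pointers, start of free space */
    uint16_t free_end;      /* end of free space, start of tuples */
    uint16_t special_start; /* start of special space */
} SketchHeader;

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), usable incrementally. */
static uint32_t crc32_update(uint32_t crc, const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc;
}

/*
 * Compute the page CRC, excluding the stored crc field and the hint area.
 * If skip_free is nonzero the free space is excluded too, at the cost of
 * trusting free_start and free_end.  Returns 0 on success, or -1 if the
 * header fields we have to trust are not sane.
 */
static int page_crc(const uint8_t *page, int skip_free, uint32_t *out)
{
    SketchHeader h;
    uint32_t crc = 0xFFFFFFFFu;

    assert(page != NULL && out != NULL);
    memcpy(&h, page, sizeof h);

    /* Assumption 1: lp_start correctly marks the end of the hint area. */
    if (h.lp_start < sizeof(SketchHeader) || h.lp_start > SKETCH_PAGE_SIZE)
        return -1;

    /* Header fields after the stored crc. */
    crc = crc32_update(crc, page + offsetof(SketchHeader, lp_start),
                       sizeof h - offsetof(SketchHeader, lp_start));

    if (skip_free) {
        /* Assumptions 2 and 3: the free-space bounds are right too. */
        if (h.free_start < h.lp_start || h.free_end < h.free_start ||
            h.free_end > SKETCH_PAGE_SIZE)
            return -1;
        crc = crc32_update(crc, page + h.lp_start,
                           (size_t) (h.free_start - h.lp_start));
        crc = crc32_update(crc, page + h.free_end,
                           (size_t) (SKETCH_PAGE_SIZE - h.free_end));
    } else {
        /* Line pointers, free space, tuples and special space in one run. */
        crc = crc32_update(crc, page + h.lp_start,
                           (size_t) (SKETCH_PAGE_SIZE - h.lp_start));
    }

    *out = crc ^ 0xFFFFFFFFu;
    return 0;
}
```

With this arrangement, flipping a bit inside the hint area leaves the result
of page_crc() unchanged, while flipping a bit in a line pointer or a tuple
changes it.  Growing the hint area means moving the line pointers and
updating lp_start, after which the CRC has to be recomputed.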

> I also like the idea that we don't need to CRC check the line pointers
> because any corruption there is going to appear immediately.  However,
> the bad news is that we wouldn't find the corruption until we try to
> access bad data and might crash.

That sounds exactly like the corruption detection system we have now.
If you think that behavior is acceptable, we can skip this whole
discussion.
        regards, tom lane

