Gregory Stark <stark@enterprisedb.com> writes:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>> Remember that the xmin/xmax fields are basically the first thing we can
>> check with any degree of strictness when examining a tuple. This means that
>> if a page is partially clobbered, but not in a way that sets off the
>> invalid-page-header checks, then the odds are very high that the first
>> detectable sign of trouble will be references to transaction numbers that
>> are far away from what the system is really using.
> I'm increasingly thinking that one of the first things I'll suggest putting
> into 8.4 is a per-page checksum after all. It was talked about a while back
> and people thought it was pointless but I think the number of reports of
> hardware and kernel bugs resulting in zeroed and corrupted pages has been
> steadily going up. If not in total then as a percentage of the total problems.
It's still pointless; a checksum does nothing to prevent data
corruption. The error report might be slightly more obvious to a novice
but it doesn't bring your data back.
Something we could possibly do now is to modify these error messages:
if the transaction number we're trying to check is obviously bogus
(beyond the current XID counter or older than the current freeze
horizon) we could report it as a corrupted XID rather than exposing
the "no such clog segment" condition.
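
As a rough illustration of that check, here is a minimal standalone sketch.
It is not actual backend code: the names xid_t, xid_precedes, classify_xid
and the xid_status values are made up for this example, and the caller is
assumed to supply the next-XID counter and the freeze horizon. The only
parts borrowed from how the system really behaves are that XIDs are 32-bit
counters compared modulo 2^32 and that values below 3 are reserved special
XIDs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit transaction ID. */
typedef uint32_t xid_t;

/* XIDs below this value are reserved (invalid, bootstrap, frozen). */
#define FIRST_NORMAL_XID ((xid_t) 3)

/* Illustrative result codes, not real backend symbols. */
typedef enum
{
    XID_OK,             /* plausible: safe to go look it up in clog */
    XID_IN_FUTURE,      /* at or beyond the current XID counter */
    XID_BEFORE_HORIZON  /* older than the current freeze horizon */
} xid_status;

/*
 * Circular comparison: a precedes b if the signed 32-bit difference
 * is negative. Only meaningful for normal (non-reserved) XIDs.
 */
static bool
xid_precedes(xid_t a, xid_t b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Decide whether xid is obviously bogus before trying to read its
 * commit status. next_xid is the next XID to be assigned and
 * freeze_horizon is the oldest XID that may still appear unfrozen;
 * both are assumed to be normal XIDs supplied by the caller.
 */
static xid_status
classify_xid(xid_t xid, xid_t next_xid, xid_t freeze_horizon)
{
    /* Reserved XIDs never need a clog lookup, so they are not suspect. */
    if (xid < FIRST_NORMAL_XID)
        return XID_OK;

    /* Not strictly before the counter: it has not been assigned yet. */
    if (!xid_precedes(xid, next_xid))
        return XID_IN_FUTURE;

    /* Older than the horizon: its clog segment may already be gone. */
    if (xid_precedes(xid, freeze_horizon))
        return XID_BEFORE_HORIZON;

    return XID_OK;
}
```

A caller would run classify_xid() on xmin or xmax first and, for anything
other than XID_OK, report a corrupted XID instead of letting the lookup
fail on a missing segment.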
regards, tom lane