Tom Lane <tgl@sss.pgh.pa.us> writes:
> It's not really a matter of backstopping the hardware's error detection;
> if we were trying to do that, we'd keep a CRC for every data page, which
> we don't. The real reason for the WAL CRCs is as a reliable method of
> identifying the end of WAL: when the "next record" doesn't checksum you
> know it's bogus. This is a nontrivial point because of the way that we
> re-use WAL files --- the pages beyond the last successfully written page
> aren't going to be zeroes, they'll be filled with random WAL data.
Is the random WAL data really the concern? It seems like a more reliable way
of dealing with that would be to just accompany every WAL entry with a
sequential id and stop when the next id isn't the correct one.
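To make the idea concrete, here's a minimal sketch (my own illustration, not PostgreSQL's actual record format): each record carries a sequence id, and replay stops as soon as the id breaks, which is what happens when you run into stale records left over in a recycled WAL file.

```python
# Hypothetical sketch, not PostgreSQL's real WAL format: detect end-of-WAL
# by requiring each record to carry the next consecutive sequence id.
def read_valid_records(records, start_seq):
    """Return payloads until the sequence id breaks.

    `records` is a list of (seq_id, payload) pairs; stale data left over
    from a recycled WAL file carries old, non-consecutive seq ids, so the
    scan stops there.
    """
    expected = start_seq
    out = []
    for seq, payload in records:
        if seq != expected:
            break  # stale record from the recycled file: end of valid WAL
        out.append(payload)
        expected += 1
    return out

# New records 100..102 were written over a recycled file whose old
# contents (seq 7 and 8) still follow them on disk.
stream = [(100, "a"), (101, "b"), (102, "c"), (7, "old"), (8, "old")]
print(read_valid_records(stream, 100))  # ['a', 'b', 'c']
```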
I thought the problem was that if the machine crashed in the middle of writing
a WAL entry you wanted to be sure to detect that. And there's no guarantee the
fsync will write out the WAL entry in order. So it's possible the end (and
beginning) of the WAL entry will be there but the middle may still be
unwritten.
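That's the case a sequence id on the record header can't catch but a whole-record CRC can. A toy demonstration (my own sketch, using CRC-32 via zlib, not the server's actual record layout):

```python
import zlib

# Hedged sketch: a CRC over the whole record catches a torn write even
# when the header and tail reached disk but the middle sector did not.
def make_record(payload: bytes) -> bytes:
    # Prepend a 4-byte CRC-32 of the payload.
    crc = zlib.crc32(payload)
    return crc.to_bytes(4, "little") + payload

def record_ok(record: bytes) -> bool:
    # Recompute the CRC and compare against the stored one.
    crc = int.from_bytes(record[:4], "little")
    return zlib.crc32(record[4:]) == crc

rec = make_record(b"begin---middle---end")
# Simulate a crash that persisted the first and last 8 bytes but left the
# middle sector unwritten (still holding old 0xFF garbage).
torn = rec[:8] + b"\xff" * (len(rec) - 16) + rec[-8:]
print(record_ok(rec))   # True
print(record_ok(torn))  # False: middle never made it to disk
```

A plain sequence check would pass here, because the sequence id lives at the start of the record, which did get written.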
The only truly reliable way to handle this would require two fsyncs per
transaction commit which would be really unfortunate.
> Personally I think CRC32 is plenty for this job, but there were those
> arguing loudly for CRC64 back when we made the decision originally ...
So given the frequency of database crashes and WAL replays, if having one
failed replay every few million years is acceptable, I think 32 bits is more
than enough. Frankly I think 16 bits would be enough.
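A quick back-of-envelope check of that claim (my numbers, not from the thread): assume each crash recovery ends by testing one garbage record against the checksum, so the chance of wrongly accepting it and replaying junk is 2^-n per crash.

```python
# Back-of-envelope: expected time between falsely accepted garbage
# records at end-of-WAL, for an assumed (deliberately pessimistic)
# crash rate. One garbage record is checked per recovery.
crashes_per_year = 100  # assumption: a very crash-prone installation

for bits in (16, 32):
    p_false_accept = 2.0 ** -bits
    years_per_failure = 1 / (p_false_accept * crashes_per_year)
    print(f"CRC{bits}: one bad replay every {years_per_failure:,.0f} years")
```

Even at 100 crashes a year, CRC-32 gives one bad replay in roughly 43 million years; CRC-16 drops that to centuries, which is why 32 bits is the comfortable choice.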
--
greg