29.5. WAL Restoration

Postgres Pro uses WAL to protect against some types of data corruption that may occur because of hardware faults. To validate data integrity, every WAL record is protected by a CRC check.
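As a rough illustration of how a per-record CRC catches corruption, the following Python sketch attaches a checksum to a payload and re-verifies it after a simulated single-bit fault. It is not Postgres Pro code: the functions make_record and is_valid are hypothetical, and zlib.crc32 (plain CRC-32) only stands in for the CRC variant actually used for WAL records.

```python
import zlib
from typing import Tuple

Record = Tuple[int, bytes]  # (stored checksum, payload); hypothetical layout


def make_record(payload: bytes) -> Record:
    """Attach a checksum to a payload (CRC-32 stands in for the real WAL CRC)."""
    return zlib.crc32(payload), payload


def is_valid(record: Record) -> bool:
    """Recompute the checksum and compare it with the stored one."""
    stored_crc, payload = record
    return zlib.crc32(payload) == stored_crc


record = make_record(b"INSERT tuple 42")
crc, payload = record

# Simulate a transient hardware fault by flipping one bit of the payload.
damaged = (crc, bytes([payload[0] ^ 0x01]) + payload[1:])

print(is_valid(record), is_valid(damaged))  # prints: True False
```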

However, some uncommon transient faults that can be attributed to hardware may cause checksum errors while WAL records are being moved. As a result, the WAL changes cannot be applied on a replica, and physical replication terminates with a fatal error and a message about an incorrect data checksum.

Postgres Pro offers a solution to this type of issue by restoring corrupted WAL data from in-memory WAL buffers. The amount of WAL data kept in memory is specified by the wal_buffers parameter. The WAL sender process checks the CRC values of WAL records before sending them to a replica. If it detects a corrupted record, the WAL sender process tries to restore the record from the buffers. For extra protection, there are two copies of the buffers. If the corrupted record cannot be restored, the WAL sender process aborts replication with an error. The severity level of this error depends on the value of the wal_sender_panic_on_crc_error configuration parameter.

If you already have, or suspect you have, such hardware problems, it is recommended to enable additional WAL restoration from WAL buffers by modifying the wal_sender_check_crc parameter.
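The check-then-restore behavior of the WAL sender can be sketched conceptually in Python. This is only an illustration under stated assumptions, not the actual implementation: fetch_for_sending, CorruptRecordError, and the representation of the two buffer copies as a list are all hypothetical, and zlib.crc32 stands in for the real WAL CRC.

```python
import zlib
from typing import List, Optional


class CorruptRecordError(Exception):
    """Hypothetical error raised when no intact copy of a record exists."""


def crc_ok(payload: bytes, expected_crc: int) -> bool:
    # CRC-32 from zlib is a stand-in for the CRC used by real WAL records.
    return zlib.crc32(payload) == expected_crc


def fetch_for_sending(payload: bytes, expected_crc: int,
                      buffer_copies: List[Optional[bytes]]) -> bytes:
    """Return a record that passes the CRC check, restoring it if needed.

    buffer_copies models the in-memory copies of the record; an entry is
    None when that copy no longer holds the record.
    """
    if crc_ok(payload, expected_crc):
        return payload  # record is intact, send as is
    for copy in buffer_copies:
        if copy is not None and crc_ok(copy, expected_crc):
            return copy  # restored from an intact in-memory copy
    # No intact copy: replication would be aborted at this point.
    raise CorruptRecordError("record could not be restored from buffers")


good = b"UPDATE tuple 7"
crc = zlib.crc32(good)
bad = bytes([good[0] ^ 0x01]) + good[1:]  # simulated single-bit fault

# The first copy is damaged as well, the second one is intact.
restored = fetch_for_sending(bad, crc, [bad, good])
print(restored == good)  # prints: True
```

In this sketch, a record that fails verification and has no intact copy raises CorruptRecordError, which corresponds to the point where the WAL sender aborts replication.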