Thread: Is it possible for a WAL file to be missing records?
PostgreSQL version and HA extension in use
- PostgreSQL 13.10
- pg_auto_failover 2.0
CPU usage and load on the primary were rising because of a heavy workload.
A failover was triggered while a large number of WALWrite events were occurring on the primary DB.
The secondary was not promoted automatically; I confirmed that this part was a pg_auto_failover issue.
I promoted the secondary manually.
I then tried to turn the old primary into a new secondary using the archived WAL files, but a WAL record appeared to be missing.
When I inspected the WAL file with pg_waldump, a record was indeed missing.
The DB server did not crash.
Is it possible for records not to be written to a WAL file when a failover happens under high load?
I would like to know whether this should be considered a bug, or whether WAL records can be lost in this kind of situation.
Below is the information I confirmed from the DB log and pg_waldump.




I'll share some DB settings too.
hot_standby_feedback = on
hot_standby = on
synchronous_commit = on
wal_writer_flush_after = 1MB
wal_sync_method = fdatasync
wal_writer_delay = 200ms
wal_buffers = 16MB
wal_segment_size = 16MB
[When the first failover occurred]
- WAL apply DB log

- Check the WAL records using pg_waldump
I verified that no LSNs are missing within 0000000300005015000000A6 and 0000000300005015000000A7.
However, the prev LSN shown in 0000000300005015000000A8 is not found in 0000000300005015000000A7.
- The last LSN of 0000000300005015000000A7 is 5015/A6003778.
- The prev LSN of the first record of 0000000300005015000000A8 is 5015/A7FFED78.
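
The prev LSN comparison described above can also be automated. The following is only a minimal Python sketch, not a definitive tool: it assumes pg_waldump's default text output, where each record line contains "lsn: X/Y, prev X/Y", and the helper names parse_lsn and find_prev_gaps are hypothetical. It reports every record whose prev pointer does not match the LSN of the record printed just before it.

```python
import re

# Matches the "lsn: X/Y, prev X/Y" part of a pg_waldump record line
# (assumed default text output format).
RECORD_RE = re.compile(
    r"lsn: ([0-9A-Fa-f]+/[0-9A-Fa-f]+), prev ([0-9A-Fa-f]+/[0-9A-Fa-f]+)"
)


def parse_lsn(text):
    """Convert an LSN written as 'HI/LO' (hex) into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def find_prev_gaps(lines):
    """Return (last_seen_lsn, prev_lsn, lsn) for each break in the prev chain.

    A break means a record's prev pointer differs from the LSN of the
    record that was printed immediately before it.
    """
    gaps = []
    last_lsn = None
    for line in lines:
        match = RECORD_RE.search(line)
        if not match:
            continue  # skip lines that are not WAL records
        lsn = parse_lsn(match.group(1))
        prev = parse_lsn(match.group(2))
        if last_lsn is not None and prev != last_lsn:
            gaps.append((last_lsn, prev, lsn))
        last_lsn = lsn
    return gaps
```

Feeding it the lines of one pg_waldump run that spans both segment files (pg_waldump accepts a start and an end segment) lets the check cross the file boundary, which is where the mismatch was seen here.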

[When the second failover occurred]
- DB log

- Check the WAL records using pg_waldump
The last LSN of 000000030000501E0000008E is 501E/8EFFCED8.
The prev LSN of the first record in 000000030000501E0000008F is 501E/8EFFEEC8.
Given the large gap between these two LSNs, WAL records appear to have been lost.
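
To put numbers on the gap, the LSN arithmetic can be sketched in Python. This is an illustration under stated assumptions rather than an official utility: it assumes timeline 3 (the 00000003 prefix of the file names) and wal_segment_size = 16MB as in the settings above, and the function names parse_lsn, segment_file_name and lsn_distance are hypothetical.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumes wal_segment_size = 16MB


def parse_lsn(text):
    """Convert an LSN written as 'HI/LO' (hex) into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def segment_file_name(timeline, lsn, segment_size=WAL_SEGMENT_SIZE):
    """Name of the WAL segment file that contains the given LSN."""
    segments_per_xlogid = 0x100000000 // segment_size  # 256 for 16MB segments
    segno = lsn // segment_size
    return "%08X%08X%08X" % (
        timeline,
        segno // segments_per_xlogid,
        segno % segments_per_xlogid,
    )


def lsn_distance(start, end):
    """Number of bytes between two LSNs given in 'HI/LO' form."""
    return parse_lsn(end) - parse_lsn(start)


last_lsn_in_8e = "501E/8EFFCED8"        # last LSN seen in ...8E
prev_of_first_in_8f = "501E/8EFFEEC8"   # prev LSN of first record in ...8F

print(segment_file_name(3, parse_lsn(prev_of_first_in_8f)))
print(lsn_distance(last_lsn_in_8e, prev_of_first_in_8f))
```

Under these assumptions, both LSNs map to segment 000000030000501E0000008E and are 8176 bytes apart, so the record that the first record of 000000030000501E0000008F points back to would be expected near the end of 000000030000501E0000008E.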
