Oh, I forgot to mention that I was able to recover the secondary by restoring 00000001000000AA000000A0 from the archives into pg_wal. The secondary was then able to finish recovery, start streaming replication, and fetch the subsequent WALs.

I wondered why there was a CHECKPOINT_SHUTDOWN record, so I dug a little deeper.

First, the filesystem on the primary was full and I got:

    PANIC: could not write to file "pg_wal/xlogtemp.305": No space left on device

The instance crashed and restarted in recovery mode. At the end of the recovery I got:

    checkpoint starting: end-of-recovery immediate
    checkpoint complete: ...

Then a FATAL message:

    FATAL: could not write to file "pg_wal/xlogtemp.9405": No space left on device

followed by:

    aborting startup due to process failure

Maybe it is this checkpoint that was not replicated? The primary had enough space to write this record. But I don't understand how the secondary received records beginning at AA/A1... (the sketch after the links below shows how an LSN maps onto a segment file name).

I googled this and found other similar issues:

https://www.postgresql.org/message-id/flat/15938-8591df7e95064538%40postgresql.org
https://www.postgresql.org/message-id/CAMp7vw97871F21X7FHHdmU2FXGME4HTgMYxkAubMdCU2xevmxQ%40mail.gmail.com
https://www.postgresql.org/message-id/flat/E73F4CFB-E322-461E-B1EC-82FAA808FEE6%40lifetrenz.com
https://www.postgresql.org/message-id/15398-b4896eebf0bed218%40postgresql.org
https://www.postgresql.org/message-id/flat/15412-f9a89b026e6774d1%40postgresql.org
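To convince myself of the mapping between LSNs and WAL segment file names, here is a small Python sketch. It assumes the default 16MB wal_segment_size and timeline 1, and is only an illustration of the naming scheme, not the recovery procedure itself:

    # Map an LSN ("hi/lo" in hex) to the 24-hex-char WAL segment file
    # name that contains it. Assumes the default 16MB wal_segment_size.
    WAL_SEG_SIZE = 16 * 1024 * 1024

    def wal_segment_name(lsn: str, timeline: int = 1) -> str:
        hi, lo = (int(part, 16) for part in lsn.split("/"))
        seg = lo // WAL_SEG_SIZE  # segment number within this "xlogid"
        return f"{timeline:08X}{hi:08X}{seg:08X}"

    # The segment I had to restore from the archives:
    print(wal_segment_name("AA/A0000000"))  # 00000001000000AA000000A0
    # Records beginning at AA/A1... live in the *next* segment:
    print(wal_segment_name("AA/A1000000"))  # 00000001000000AA000000A1

So a record at AA/A1... sits in segment ...A1, one segment past the ...A0 file that was missing on the secondary, which is exactly what puzzles me.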
--
Adrien NAYRAT
https://blog.anayrat.info