"invalid record length" after restoring pg_basebackup - Mailing list pgsql-general
From | Dennis Jacobfeuerborn |
---|---|
Subject | "invalid record length" after restoring pg_basebackup |
Date | |
Msg-id | c729cdd1-73c3-73f8-9f40-dd0fc84efb29@conversis.de Whole thread Raw |
Responses |
Re: "invalid record length" after restoring pg_basebackup
|
List | pgsql-general |
Hi, I've run into a strange issue after restoring a backup that I created using pg_basebackup on a standby instance. The command I use to create the backup is this: pg_basebackup -v --write-recovery-conf -h$BACKUP_HOST -p5432 -U$BACKUP_USER --format tar --wal-method stream --compress=2 -D "$BACKUP_DIR" This backup runs fine and produces a "base.tar.gz" and "pg_wal.tar.gz" file. The server version is 11.7. To restore the backup on another system I unpack the "base.tar.gz" file into the data directory and "pg_wal.tar.gz" into the "pg_wal" sub-directory. I then comment out the "primary_conninfo" directive in the "recovery.conf" file since I don't want to replicate from the primary on this system I just want to restore the state from the backup. All of this works fine and the logs report that the db reaches a consistent recovery state but as last entry it reports an "invalid record length": 2020-11-13 12:25:34.266 UTC [1] LOG: listening on IPv4 address "0.0.0.0", port 5432 2020-11-13 12:25:34.266 UTC [1] LOG: listening on IPv6 address "::", port 5432 2020-11-13 12:25:34.269 UTC [1] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" 2020-11-13 12:25:34.437 UTC [20] LOG: database system was interrupted while in recovery at log time 2020-11-03 14:36:00 UTC 2020-11-13 12:25:34.437 UTC [20] HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target. 2020-11-13 12:25:35.359 UTC [20] WARNING: recovery command file "recovery.conf" specified neither primary_conninfo nor restore_command 2020-11-13 12:25:35.359 UTC [20] HINT: The database server will regularly poll the pg_wal subdirectory to check for files placed there. 2020-11-13 12:25:35.359 UTC [20] LOG: entering standby mode 2020-11-13 12:25:35.372 UTC [20] LOG: redo starts at 932/9D123E70 2020-11-13 12:26:45.435 UTC [20] LOG: consistent recovery state reached at 933/AA63F600 2020-11-13 12:26:45.435 UTC [1] LOG: database system is ready to accept read only connections 2020-11-13 12:26:45.444 UTC [20] LOG: invalid record length at 933/AA6DA660: wanted 24, got 0 Looking at the WAL data the last file is "0000000100000933000000AA" and there is a file "archive_status/0000000100000933000000A9.done" but no corresponding ".done" file for the "...AA" WAL file. Running pg_waldump on the "...AA" file shows the following at the end: $ pg_waldump 0000000100000933000000AA ... rmgr: Heap len (rec/tot): 54/ 54, tx: 1743220, lsn: 933/AA6DA510, prev 933/AA6D8F08, desc: LOCK off 12: xid 1743220: flags 0 LOCK_ONLY KEYSHR_LOCK , blkref #0: rel 1663/13117/874105 blk 0 rmgr: Heap len (rec/tot): 81/ 81, tx: 1743220, lsn: 933/AA6DA548, prev 933/AA6DA510, desc: INSERT off 19, blkref #0: rel 1663/13117/868296 blk 0 rmgr: Btree len (rec/tot): 80/ 80, tx: 1743220, lsn: 933/AA6DA5A0, prev 933/AA6DA548, desc: INSERT_LEAF off 24, blkref #0: rel 1663/13117/868299 blk 1 rmgr: Btree len (rec/tot): 72/ 72, tx: 1743220, lsn: 933/AA6DA5F0, prev 933/AA6DA5A0, desc: INSERT_LEAF off 23, blkref #0: rel 1663/13117/915254 blk 2 rmgr: Transaction len (rec/tot): 34/ 34, tx: 1743220, lsn: 933/AA6DA638, prev 933/AA6DA5F0, desc: COMMIT 2020-11-03 14:49:29.205562 UTC pg_waldump: FATAL: error in WAL record at 933/AA6DA638: invalid record length at 933/AA6DA660: wanted 24, got 0 It looks like the WAL in the backup is inconsistent at the end but I'd expect pg_basebackup to create a clean backup. Is this something I should expect or is this a problem with pg_basebackup? Regards, Dennis
pgsql-general by date: