Thread: pg ignores wal files in pg_wal, and instead tries to load them from archive/primary
pg ignores wal files in pg_wal, and instead tries to load them from archive/primary
From
hubert depesz lubaczewski
Date: Thu, Sep 29, 2022 17:51:02 +0200
Hi,
we have the following situation:

1. primary on 14.5 that is *not* archiving (this is a temporary situation related to an ongoing upgrade from pg 12) - all on ubuntu focal.

2. on the new replica we run (via a wrapper, but this doesn't seem to be related):

   pg_basebackup -D /var/lib/postgresql/14/main -c fast -v -P -U some-user -h sourcedb.hostname

3. after it is done, if the datadir was large enough, pg on the replica doesn't replicate/catch up, because, from the logs:

   2022-09-29 14:59:26.587 UTC,,,2355588,,6335b2ce.23f184,1,,2022-09-29 14:59:26 UTC,,0,LOG,00000,"started streaming WAL from primary at 7E8/67000000 on timeline 1",,,,,,,,,"","walreceiver",,0
   2022-09-29 14:59:26.587 UTC,,,2355588,,6335b2ce.23f184,2,,2022-09-29 14:59:26 UTC,,0,FATAL,08P01,"could not receive data from WAL stream: ERROR: requested WAL segment 00000001000007E800000067 has already been removed",,,,,,,,,"","walreceiver",,0

4. if there is a restore_command configured, it tries to read the data from the archive too, but the archive is non-existent.

5. the "missing" file is there, in pg_wal (I would assume that pg_basebackup copied it there):

   root@host# /bin/ls -c1 0* | wc -l
   1068
   root@host# /bin/ls -c1 0* | sort -V | head -n 1
   00000001000007E4000000A0
   root@host# /bin/ls -c1 0* | sort -V | tail -n 1
   00000001000007E800000092
   root@host# /bin/ls -c1 0* | sort -V | grep -n 00000001000007E800000067
   1043:00000001000007E800000067
   root@host# /bin/ls -c1 0* | sort -V | grep -n -C5 00000001000007E800000067
   1038-00000001000007E800000062
   1039-00000001000007E800000063
   1040-00000001000007E800000064
   1041-00000001000007E800000065
   1042-00000001000007E800000066
   1043:00000001000007E800000067
   1044-00000001000007E800000068
   1045-00000001000007E800000069
   1046-00000001000007E800000070
   1047-00000001000007E800000071
   1048-00000001000007E800000072

6. What's more - I straced the startup process, and it:
   a. opens the wal file (the problematic one)
   b. reads 8k from it
   c. closes it
   d. checks for the existence of the finish.recovery trigger file (it doesn't exist)
   e. starts the restore program (which fails)
   f. rinse and repeat

What am I missing? What is wrong? How can I fix it? The problem is not fixing *this server*, because we are in the process of upgrading LOTS and LOTS of servers, and I need to know what is broken / how to work around it.

Currently our go-to fix is:
1. increase wal_keep_size to ~ 200GB
2. stand up the replica
3. once it catches up, decrease wal_keep_size to our standard 16GB

but it is not really a nice solution.

Best regards,

depesz
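[The temporary wal_keep_size bump described above can be scripted; a minimal sketch, assuming a superuser connection to the primary (the host and user names are the placeholders from the report). wal_keep_size is reloadable, so no restart is needed:

    psql -h sourcedb.hostname -U some-user -d postgres \
         -c "ALTER SYSTEM SET wal_keep_size = '200GB'" \
         -c "SELECT pg_reload_conf()"

    # ... take the base backup, start the replica, wait for it to catch up ...

    psql -h sourcedb.hostname -U some-user -d postgres \
         -c "ALTER SYSTEM SET wal_keep_size = '16GB'" \
         -c "SELECT pg_reload_conf()"
]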
Re: pg ignores wal files in pg_wal, and instead tries to load them from archive/primary
From
Michael Paquier
Date: Fri, Sep 30, 2022 09:49:28 +0900
On Thu, Sep 29, 2022 at 05:51:02PM +0200, hubert depesz lubaczewski wrote:
> What am I missing? What is wrong? How can I fix it? The problem is not fixing
> *this server*, because we are in the process of upgrading LOTS and LOTS of servers,
> and I need to know what is broken / how to work around it.

So, your backup_label file points to 00000001000007E4000000A0 as being
the first segment to begin recovery from. This means that the startup
process is able to correctly kick off recovery using the contents of
the local pg_wal/ from 00000001000007E4000000A0 to
00000001000007E800000066. Once 00000001000007E800000067 cannot be
read, the startup process fails to read this segment and decides to
switch the WAL source to streaming. The order should be local
pg_wal/, then archive, and finally streaming, if all are configured,
as far as I recall.

What's the error you are seeing when reading 00000001000007E800000067?
Is pg_waldump able to understand its internals? This could point to
an issue with xlogreader.c, for example.

Another thing that has changed in this area is related to continuation
records. What are the WAL contents at the end of
00000001000007E800000066? Do you spot an OVERWRITE_CONTRECORD when
using pg_waldump on it?
--
Michael
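[Both checks can be run directly against the files in pg_wal; a rough sketch, assuming the Debian-style paths shown elsewhere in the thread, with the segment names taken from the error message above:

    cd /var/lib/postgresql/14/main/pg_wal
    # can xlogreader parse the "missing" segment at all?
    /usr/lib/postgresql/14/bin/pg_waldump 00000001000007E800000067 | head
    # any OVERWRITE_CONTRECORD near the end of the previous segment?
    /usr/lib/postgresql/14/bin/pg_waldump 00000001000007E800000066 | tail -n 20 | grep -c OVERWRITE_CONTRECORD
]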
Re: pg ignores wal files in pg_wal, and instead tries to load them from archive/primary
From
hubert depesz lubaczewski
Date:
On Fri, Sep 30, 2022 at 09:49:28AM +0900, Michael Paquier wrote:
> On Thu, Sep 29, 2022 at 05:51:02PM +0200, hubert depesz lubaczewski wrote:
> > What am I missing? What is wrong? How can I fix it? The problem is not fixing
> > *this server*, because we are in the process of upgrading LOTS and LOTS of servers,
> > and I need to know what is broken / how to work around it.
> So, your backup_label file points to 00000001000007E4000000A0 as being
> the first segment to begin recovery from. This means that the startup

Well, *now* I don't have backup_label:

root@c273-pg2258:/var/lib/postgresql/14/main# ls -l
total 68520
-rw------- 1 postgres postgres        3 Sep 28 01:38 PG_VERSION
-rw------- 1 postgres postgres 61264030 Sep 30 04:42 autoprewarm.blocks
-rw------- 1 postgres postgres      231 Sep 28 01:38 backup_label.old
-rw------- 1 postgres postgres  8783060 Sep 28 01:38 backup_manifest
drwx------ 6 postgres postgres       74 Sep 28 01:38 base
-rw------- 1 postgres postgres       61 Sep 30 04:00 current_logfiles
drwx------ 2 postgres postgres     4096 Sep 29 14:25 global
drwx------ 2 postgres postgres        6 Sep 28 01:38 pg_commit_ts
drwx------ 2 postgres postgres        6 Sep 28 01:38 pg_dynshmem
drwx------ 4 postgres postgres       84 Sep 28 03:38 pg_logical
drwx------ 4 postgres postgres       48 Sep 28 01:38 pg_multixact
drwx------ 2 postgres postgres        6 Sep 28 01:38 pg_notify
drwx------ 2 postgres postgres        6 Sep 28 01:38 pg_replslot
drwx------ 2 postgres postgres        6 Sep 28 01:38 pg_serial
drwx------ 2 postgres postgres        6 Sep 28 01:38 pg_snapshots
drwx------ 2 postgres postgres        6 Sep 29 14:25 pg_stat
drwx------ 2 postgres postgres       43 Sep 29 14:25 pg_stat_tmp
drwx------ 2 postgres postgres      122 Sep 29 14:25 pg_subtrans
drwx------ 2 postgres postgres       27 Sep 28 01:38 pg_tblspc
drwx------ 2 postgres postgres        6 Sep 28 01:38 pg_twophase
drwx------ 3 postgres postgres   204800 Sep 28 03:38 pg_wal
drwx------ 2 postgres postgres     4096 Sep 28 01:38 pg_xact
-rw------- 1 postgres postgres      118 Sep 28 01:38 postgresql.auto.conf
-rw------- 1 postgres postgres      130 Sep 29 14:25 postmaster.opts
-rw------- 1 postgres postgres      103 Sep 29 14:25 postmaster.pid
-rw-r--r-- 1 postgres postgres        0 Sep 28 01:38 standby.signal

backup_label.old is:

root@c273-pg2258:/var/lib/postgresql/14/main# cat backup_label.old
START WAL LOCATION: 7D5/3B068338 (file 00000001000007D50000003B)
CHECKPOINT LOCATION: 7D5/3B076F78
BACKUP METHOD: streamed
BACKUP FROM: primary
START TIME: 2022-09-27 21:06:00 UTC
LABEL: pg_basebackup base backup
START TIMELINE: 1

> process is able to correctly kick off recovery using the contents of
> the local pg_wal/ from 00000001000007E4000000A0 to
> 00000001000007E800000066. Once 00000001000007E800000067 cannot be
> read, the startup process fails to read this segment and decides to
> switch the WAL source to streaming. The order should be local
> pg_wal/, then archive, and finally streaming, if all are configured,
> as far as I recall.
> What's the error you are seeing when reading 00000001000007E800000067?

I don't see any errors.
Let me show you some data from the log:

2022-09-30 04:44:43.833 UTC,"postgres","postgres",2716081,"[local]",6336743b.2971b1,3,"authentication",2022-09-30 04:44:43 UTC,4/14269,0,LOG,00000,"connection authorized: user=postgres database=postgres application_name=psql",,,,,,,,,"","client backend",,1147616880456321454
2022-09-30 04:44:43.917 UTC,,,2716082,,6336743b.2971b2,1,,2022-09-30 04:44:43 UTC,,0,LOG,00000,"started streaming WAL from primary at 7E8/67000000 on timeline 1",,,,,,,,,"","walreceiver",,0
2022-09-30 04:44:43.918 UTC,,,2716082,,6336743b.2971b2,2,,2022-09-30 04:44:43 UTC,,0,FATAL,08P01,"could not receive data from WAL stream: ERROR: requested WAL segment 00000001000007E800000067 has already been removed",,,,,,,,,"","walreceiver",,0
2022-09-30 04:44:48.922 UTC,,,2716187,,63367440.29721b,1,,2022-09-30 04:44:48 UTC,,0,LOG,00000,"started streaming WAL from primary at 7E8/67000000 on timeline 1",,,,,,,,,"","walreceiver",,0
2022-09-30 04:44:48.922 UTC,,,2716187,,63367440.29721b,2,,2022-09-30 04:44:48 UTC,,0,FATAL,08P01,"could not receive data from WAL stream: ERROR: requested WAL segment 00000001000007E800000067 has already been removed",,,,,,,,,"","walreceiver",,0
2022-09-30 04:44:51.867 UTC,,,2716207,"[local]",63367443.29722f,1,"",2022-09-30 04:44:51 UTC,,0,LOG,00000,"connection received: host=[local]",,,,,,,,,"","not initialized",,0
2022-09-30 04:44:51.868 UTC,"postgres","db",2716207,"[local]",63367443.29722f,2,"authentication",2022-09-30 04:44:51 UTC,4/14271,0,LOG,00000,"connection authenticated: identity=""postgres"" method=peer (/etc/postgresql/14/main/pg_hba.conf:11)",,,,,,,,,"","client backend",,-659920678270393094
2022-09-30 04:44:51.868 UTC,"postgres","db",2716207,"[local]",63367443.29722f,3,"authentication",2022-09-30 04:44:51 UTC,4/14271,0,LOG,00000,"connection authorized: user=postgres database=db application_name=psql",,,,,,,,,"","client backend",,-659920678270393094
2022-09-30 04:44:53.751 UTC,,,2716218,"127.0.0.1:39078",63367445.29723a,1,"",2022-09-30 04:44:53 UTC,,0,LOG,00000,"connection received: host=127.0.0.1 port=39078",,,,,,,,,"","not initialized",,0
2022-09-30 04:44:53.927 UTC,,,2716219,,63367445.29723b,1,,2022-09-30 04:44:53 UTC,,0,LOG,00000,"started streaming WAL from primary at 7E8/67000000 on timeline 1",,,,,,,,,"","walreceiver",,0
2022-09-30 04:44:53.927 UTC,,,2716219,,63367445.29723b,2,,2022-09-30 04:44:53 UTC,,0,FATAL,08P01,"could not receive data from WAL stream: ERROR: requested WAL segment 00000001000007E800000067 has already been removed",,,,,,,,,"","walreceiver",,0
2022-09-30 04:44:58.931 UTC,,,2716255,,6336744a.29725f,1,,2022-09-30 04:44:58 UTC,,0,LOG,00000,"started streaming WAL from primary at 7E8/67000000 on timeline 1",,,,,,,,,"","walreceiver",,0
2022-09-30 04:44:58.931 UTC,,,2716255,,6336744a.29725f,2,,2022-09-30 04:44:58 UTC,,0,FATAL,08P01,"could not receive data from WAL stream: ERROR: requested WAL segment 00000001000007E800000067 has already been removed",,,,,,,,,"","walreceiver",,0
2022-09-30 04:45:03.936 UTC,,,2716366,,6336744f.2972ce,1,,2022-09-30 04:45:03 UTC,,0,LOG,00000,"started streaming WAL from primary at 7E8/67000000 on timeline 1",,,,,,,,,"","walreceiver",,0
2022-09-30 04:45:03.936 UTC,,,2716366,,6336744f.2972ce,2,,2022-09-30 04:45:03 UTC,,0,FATAL,08P01,"could not receive data from WAL stream: ERROR: requested WAL segment 00000001000007E800000067 has already been removed",,,,,,,,,"","walreceiver",,0
2022-09-30 04:45:08.751 UTC,,,2716401,"127.0.0.1:45926",63367454.2972f1,1,"",2022-09-30 04:45:08 UTC,,0,LOG,00000,"connection received: host=127.0.0.1 port=45926",,,,,,,,,"","not initialized",,0
2022-09-30 04:45:08.940 UTC,,,2716402,,63367454.2972f2,1,,2022-09-30 04:45:08 UTC,,0,LOG,00000,"started streaming WAL from primary at 7E8/67000000 on timeline 1",,,,,,,,,"","walreceiver",,0
2022-09-30 04:45:08.940 UTC,,,2716402,,63367454.2972f2,2,,2022-09-30 04:45:08 UTC,,0,FATAL,08P01,"could not receive data from WAL stream: ERROR: requested WAL segment 00000001000007E800000067 has already been removed",,,,,,,,,"","walreceiver",,0
2022-09-30 04:45:12.960 UTC,,,2716425,"[local]",63367458.297309,1,"",2022-09-30 04:45:12 UTC,,0,LOG,00000,"connection received: host=[local]",,,,,,,,,"","not initialized",,0
2022-09-30 04:45:12.961 UTC,"postgres","postgres",2716425,"[local]",63367458.297309,2,"authentication",2022-09-30 04:45:12 UTC,4/14277,0,LOG,00000,"connection authenticated: identity=""postgres"" method=peer (/etc/postgresql/14/main/pg_hba.conf:11)",,,,,,,,,"","client backend",,1147616880456321454
2022-09-30 04:45:12.961 UTC,"postgres","postgres",2716425,"[local]",63367458.297309,3,"authentication",2022-09-30 04:45:12 UTC,4/14277,0,LOG,00000,"connection authorized: user=postgres database=postgres application_name=psql",,,,,,,,,"","client backend",,1147616880456321454
2022-09-30 04:45:13.946 UTC,,,2716431,,63367459.29730f,1,,2022-09-30 04:45:13 UTC,,0,LOG,00000,"started streaming WAL from primary at 7E8/67000000 on timeline 1",,,,,,,,,"","walreceiver",,0
2022-09-30 04:45:13.946 UTC,,,2716431,,63367459.29730f,2,,2022-09-30 04:45:13 UTC,,0,FATAL,08P01,"could not receive data from WAL stream: ERROR: requested WAL segment 00000001000007E800000067 has already been removed",,,,,,,,,"","walreceiver",,0

> Is pg_waldump able to understand its internals? This could point to
> an issue with xlogreader.c, for example.

So... it can, but there are no records:

postgres@c273-pg2258:~/14/main$ /usr/lib/postgresql/14/bin/pg_waldump 00000001000007E800000067
pg_waldump: fatal: could not find a valid record after 7E8/67000000
postgres@c273-pg2258:~/14/main$ /usr/lib/postgresql/14/bin/pg_waldump 00000001000007E800000068
pg_waldump: fatal: could not find a valid record after 7E8/68000000

but the previous one works:

postgres@c273-pg2258:~/14/main$ /usr/lib/postgresql/14/bin/pg_waldump 00000001000007E800000066 | tail -n 20
rmgr: Transaction len (rec/tot):   65/   65, tx: 140005495, lsn: 7E8/66A0B010, prev 7E8/66A0AFC8, desc: COMMIT 2022-09-28 01:38:20.782352 UTC; origin: node 1, lsn CE64/A5086920, at 2022-09-28 01:38:20.780890 UTC
rmgr: Heap        len (rec/tot):   54/   54, tx: 140005496, lsn: 7E8/66A0B058, prev 7E8/66A0B010, desc: LOCK off 55: xid 140005496: flags 0x00 LOCK_ONLY EXCL_LOCK KEYS_UPDATED , blkref #0: rel 17963/16644/195315099 blk 83995
rmgr: Heap        len (rec/tot):  165/  165, tx: 140005496, lsn: 7E8/66A0B090, prev 7E8/66A0B058, desc: HOT_UPDATE off 55 xmax 140005496 flags 0x10 KEYS_UPDATED ; new off 74 xmax 140005496, blkref #0: rel 17963/16644/195315099 blk 83995
rmgr: Transaction len (rec/tot):   65/   65, tx: 140005496, lsn: 7E8/66A0B138, prev 7E8/66A0B090, desc: COMMIT 2022-09-28 01:38:20.783036 UTC; origin: node 3, lsn CE64/A5087E00, at 2022-09-28 01:38:20.780992 UTC
rmgr: XLOG        len (rec/tot):   51/  980, tx:         0, lsn: 7E8/66A0B180, prev 7E8/66A0B138, desc: FPI_FOR_HINT, blkref #0: rel 17963/16644/195326137 blk 271085 FPW
rmgr: Heap        len (rec/tot):   54/   54, tx: 140005497, lsn: 7E8/66A0B558, prev 7E8/66A0B180, desc: LOCK off 13: xid 140005497: flags 0x00 LOCK_ONLY EXCL_LOCK KEYS_UPDATED , blkref #0: rel 17963/16644/195326137 blk 271085
rmgr: Heap        len (rec/tot):  261/  261, tx: 140005497, lsn: 7E8/66A0B590, prev 7E8/66A0B558, desc: HOT_UPDATE off 13 xmax 140005497 flags 0x10 KEYS_UPDATED ; new off 15 xmax 140005497, blkref #0: rel 17963/16644/195326137 blk 271085
rmgr: Transaction len (rec/tot):   65/   65, tx: 140005497, lsn: 7E8/66A0B698, prev 7E8/66A0B590, desc: COMMIT 2022-09-28 01:38:20.784762 UTC; origin: node 1, lsn CE64/A508C7F0, at 2022-09-28 01:38:20.782858 UTC
rmgr: Heap        len (rec/tot):   54/   54, tx: 140005498, lsn: 7E8/66A0B6E0, prev 7E8/66A0B698, desc: LOCK off 8: xid 140005498: flags 0x00 LOCK_ONLY EXCL_LOCK KEYS_UPDATED , blkref #0: rel 17963/16644/195328502 blk 3038500
rmgr: Heap        len (rec/tot):   54/   54, tx: 140005498, lsn: 7E8/66A0B718, prev 7E8/66A0B6E0, desc: LOCK off 8: xid 140005498: flags 0x00 LOCK_ONLY EXCL_LOCK KEYS_UPDATED , blkref #0: rel 17963/16644/195328502 blk 3038500
rmgr: Heap        len (rec/tot): 1821/ 1821, tx: 140005498, lsn: 7E8/66A0B750, prev 7E8/66A0B718, desc: UPDATE off 8 xmax 140005498 flags 0x10 KEYS_UPDATED ; new off 5 xmax 140005498, blkref #0: rel 17963/16644/195328502 blk 3038837, blkref #1: rel 17963/16644/195328502 blk 3038500
rmgr: Btree       len (rec/tot):   64/   64, tx: 140005498, lsn: 7E8/66A0BE70, prev 7E8/66A0B750, desc: INSERT_LEAF off 158, blkref #0: rel 17963/16644/195349680 blk 78396
rmgr: Btree       len (rec/tot):   72/   72, tx: 140005498, lsn: 7E8/66A0BEB0, prev 7E8/66A0BE70, desc: INSERT_LEAF off 72, blkref #0: rel 17963/16644/419942198 blk 82241
rmgr: Btree       len (rec/tot):   64/   64, tx: 140005498, lsn: 7E8/66A0BEF8, prev 7E8/66A0BEB0, desc: INSERT_LEAF off 268, blkref #0: rel 17963/16644/419946394 blk 18486
rmgr: Btree       len (rec/tot):   64/   64, tx: 140005498, lsn: 7E8/66A0BF38, prev 7E8/66A0BEF8, desc: INSERT_LEAF off 69, blkref #0: rel 17963/16644/419946856 blk 56634
rmgr: Btree       len (rec/tot):   64/   64, tx: 140005498, lsn: 7E8/66A0BF78, prev 7E8/66A0BF38, desc: INSERT_LEAF off 98, blkref #0: rel 17963/16644/419947709 blk 19044
rmgr: Btree       len (rec/tot):   64/   64, tx: 140005498, lsn: 7E8/66A0BFB8, prev 7E8/66A0BF78, desc: INSERT_LEAF off 53, blkref #0: rel 17963/16644/419948176 blk 10715
rmgr: Transaction len (rec/tot):   65/   65, tx: 140005498, lsn: 7E8/66A0BFF8, prev 7E8/66A0BFB8, desc: COMMIT 2022-09-28 01:38:20.785822 UTC; origin: node 1, lsn CE64/A5091128, at 2022-09-28 01:38:20.784288 UTC
rmgr: XLOG        len (rec/tot):   34/   34, tx:         0, lsn: 7E8/66A0C058, prev 7E8/66A0BFF8, desc: BACKUP_END 7D5/3B068338
rmgr: XLOG        len (rec/tot):   24/   24, tx:         0, lsn: 7E8/66A0C080, prev 7E8/66A0C058, desc: SWITCH

> Another thing that has changed in this area is related to continuation
> records. What are the WAL contents at the end of
> 00000001000007E800000066? Do you spot an OVERWRITE_CONTRECORD when
> using pg_waldump on it?

No. There is no such record in the pg_waldump output for the *66 file.

If I understand correctly, this would suggest that the .66 file is the end of the WAL fetched by pg_basebackup. What are the 67 and following files, then?

Best regards,

depesz
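[The BACKUP_END and SWITCH records above can also be located without eyeballing the full output; a rough sketch, assuming it is run from inside pg_wal/ with the same paths as above:

    cd /var/lib/postgresql/14/main/pg_wal
    for f in 00000001000007E8000000*; do
        /usr/lib/postgresql/14/bin/pg_waldump "$f" 2>/dev/null \
            | grep -E 'BACKUP_END|SWITCH|OVERWRITE_CONTRECORD' \
            | sed "s/^/$f: /"
    done
]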
Re: pg ignores wal files in pg_wal, and instead tries to load them from archive/primary
From
Jeff Janes
Date:
> Currently our go-to fix is:
> 1. increase wal_keep_size to ~ 200GB
> 2. stand up the replica
> 3. once it catches up, decrease wal_keep_size to our standard 16GB
> but it is not really a nice solution.
If you don't have an archive and aren't using a permanent slot, then you always have the risk of blowing past your available WAL.
Increasing wal_keep_size is an option to reduce this risk. Yes, it is not so nice, but you have already declined to use the nicer options.
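[For reference, the permanent-slot variant is a small change to the pg_basebackup invocation from the original report; a sketch using pg_basebackup's --create-slot/--slot options, with a made-up slot name:

    pg_basebackup -D /var/lib/postgresql/14/main -c fast -v -P \
        -U some-user -h sourcedb.hostname \
        --create-slot --slot=upgrade_replica_1

    # on the standby, before starting it:
    echo "primary_slot_name = 'upgrade_replica_1'" \
        >> /var/lib/postgresql/14/main/postgresql.auto.conf

    # once the standby no longer needs the slot, drop it on the primary:
    # SELECT pg_drop_replication_slot('upgrade_replica_1');
]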
Where those extra unusable files came from is a curiosity, but nothing more than that. I'm guessing they were created by the wrapper you said you use, as I can't reproduce their existence with just pg_basebackup. They are probably "recycled ahead" files from the master that somehow got copied needlessly, with a file name that does not match the file contents (since the contents are from a previous name the file had). pg_waldump isn't as helpful as it could be in those cases.
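[One way to check that hypothesis on a suspect file: every WAL page starts with a header whose 8-byte xlp_pageaddr field, at offset 8, records the address the page was written for, so a recycled-ahead segment should carry an address that does not match its file name. A sketch, assuming the default 16MB segments and a little-endian host:

    # for a genuine 00000001000007E800000067 this should print 000007e867000000;
    # a recycled file would instead show the address from its previous name
    od -A none -t x8 -j 8 -N 8 pg_wal/00000001000007E800000067
]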
Cheers,
Jeff