BUG #19032: In restore_command %f parameter does not support WAL partial files. - Mailing list pgsql-bugs
From | PG Bug reporting form |
---|---|
Subject | BUG #19032: In restore_command %f parameter does not support WAL partial files. |
Date | |
Msg-id | 19032-7904cffd24dd96a5@postgresql.org Whole thread Raw |
Responses |
Re: BUG #19032: In restore_command %f parameter does not support WAL partial files.
|
List | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 19032 Logged by: Wojciech Szenajch Email address: wszenajch@gmail.com PostgreSQL version: 17.6 Operating system: Ubuntu 24.04.03 LTS Description: Using restore_command = 'cp /tmp/receivewal/%f %p' does not use WAL partial file during backup restore process. %p value is used by Postgres to copy WAL and history files and is filled by Postgres internally without any Administrator influence. Example: 1. Contents of /tmp/receivewal: 0000000300000001000000E6 0000000300000001000000E7 0000000300000001000000E8 0000000300000001000000E9 0000000300000001000000EA ... 000000030000000200000004 000000030000000200000005 000000030000000200000006.partial 00000003.history 2. Log from restore process using restore_command = 'cp /tmp/receivewal/%f %p' : 2025-08-25 11:15:48.100 CEST [3326] LOG: starting PostgreSQL 17.6 (Ubuntu 17.6-1.pgdg24.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0, 64-bit 2025-08-25 11:15:48.100 CEST [3326] LOG: listening on IPv4 address "0.0.0.0", port 5432 2025-08-25 11:15:48.101 CEST [3326] LOG: listening on IPv6 address "::", port 5432 2025-08-25 11:15:48.111 CEST [3326] LOG: listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" 2025-08-25 11:15:48.139 CEST [3330] LOG: database system was interrupted; last known up at 2025-08-25 10:28:23 CEST cp: cannot stat '/tmp/receivewal/00000004.history': No such file or directory 2025-08-25 11:15:51.197 CEST [3330] LOG: starting backup recovery with redo LSN 1/E7000028, checkpoint LSN 1/E7000080, on timeline ID 3 2025-08-25 11:15:51.213 CEST [3330] LOG: restored log file "00000003.history" from archive 2025-08-25 11:15:51.318 CEST [3330] LOG: restored log file "0000000300000001000000E7" from archive 2025-08-25 11:15:51.439 CEST [3330] LOG: starting archive recovery 2025-08-25 11:15:51.463 CEST [3330] LOG: redo starts at 1/E7000028 2025-08-25 11:15:51.526 CEST [3330] LOG: restored log file "0000000300000001000000E8" from archive 2025-08-25 11:15:51.647 CEST [3330] LOG: completed backup recovery with redo LSN 1/E7000028 and end LSN 1/E7000120 2025-08-25 11:15:51.647 CEST [3330] LOG: consistent recovery state reached at 1/E7000120 2025-08-25 11:15:51.647 CEST [3326] LOG: database system is ready to accept read-only connections 2025-08-25 11:15:51.860 CEST [3330] LOG: restored log file "0000000300000001000000E9" from archive 2025-08-25 11:15:52.111 CEST [3330] LOG: restored log file "0000000300000001000000EA" from archive ... 2025-08-25 11:16:04.450 CEST [3328] LOG: recovery restart point at 1/FC09F7B0 2025-08-25 11:16:04.450 CEST [3328] DETAIL: Last completed transaction was at log time 2025-08-25 10:29:45.078323+02. 2025-08-25 11:16:04.450 CEST [3328] LOG: restartpoint starting: wal 2025-08-25 11:16:04.634 CEST [3330] LOG: restored log file "000000030000000200000004" from archive 2025-08-25 11:16:04.821 CEST [3330] LOG: restored log file "000000030000000200000005" from archive cp: cannot stat '/tmp/receivewal/000000030000000200000006': No such file or directory 2025-08-25 11:16:05.218 CEST [3330] LOG: redo done at 2/5F8CD38 system usage: CPU: user: 2.09 s, system: 4.27 s, elapsed: 13.75 s 2025-08-25 11:16:05.218 CEST [3330] LOG: last completed transaction was at log time 2025-08-25 10:31:27.632146+02 2025-08-25 11:16:05.269 CEST [3330] LOG: restored log file "000000030000000200000005" from archive cp: cannot stat '/tmp/receivewal/00000004.history': No such file or directory 2025-08-25 11:16:05.437 CEST [3330] LOG: selected new timeline ID: 4 2025-08-25 11:16:05.564 CEST [3330] LOG: restored log file "00000003.history" from archive 2025-08-25 11:16:05.583 CEST [3330] LOG: archive recovery complete 2025-08-25 11:16:06.625 CEST [3328] LOG: restartpoint complete: wrote 6392 buffers (39.0%); 1 WAL file(s) added, 2 removed, 0 recycled; write=1.443 s, sync=0.568 s, total=2.175 s; sync files=13, longest=0.444 s, average=0.044 s; distance=49438 kB, estimate=88596 kB; lsn=2/20B7918, redo lsn=1/FF0E70A8 2025-08-25 11:16:06.626 CEST [3328] LOG: recovery restart point at 1/FF0E70A8 2025-08-25 11:16:06.626 CEST [3328] DETAIL: Last completed transaction was at log time 2025-08-25 10:31:27.632146+02. 2025-08-25 11:16:06.626 CEST [3328] LOG: checkpoints are occurring too frequently (2 seconds apart) 2025-08-25 11:16:06.626 CEST [3328] HINT: Consider increasing the configuration parameter "max_wal_size". 2025-08-25 11:16:06.640 CEST [3328] LOG: checkpoint starting: end-of-recovery immediate wait wal 2025-08-25 11:16:07.033 CEST [3328] LOG: checkpoint complete: wrote 4125 buffers (25.2%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.278 s, sync=0.057 s, total=0.407 s; sync files=32, longest=0.013 s, average=0.002 s; distance=113534 kB, estimate=113534 kB; lsn=2/5FC6C38, redo lsn=2/5FC6C38 2025-08-25 11:16:07.068 CEST [3326] LOG: database system is ready to accept connections The last used is "restored log file "000000030000000200000005" from archive". Partial file is not used because is not copied: cp: cannot stat '/tmp/receivewal/000000030000000200000006': No such file or directory. When I used the following parameter value: restore_command = 'cp /tmp/receivewal/%f %p || cp /tmp/receivewal/%f.partial %p' to include files with partial extension, file 000000030000000200000006.partial was copied and seemed to be processed during restore (maybe it had no any valuable data in my experiment). The end of the log: ... 2025-08-25 11:06:46.703 CEST [3120] LOG: restored log file "000000030000000200000005" from archive cp: cannot stat '/tmp/receivewal/000000030000000200000006': No such file or directory 2025-08-25 11:06:47.604 CEST [3120] LOG: restored log file "000000030000000200000006" from archive 2025-08-25 11:06:47.670 CEST [3118] LOG: restartpoint complete: wrote 5058 buffers (30.9%); 0 WAL file(s) added, 3 removed, 0 recycled; write=1.791 s, sync=0.506 s, total=2.469 s; sync files=5, longest=0.303 s, average=0.102 s; distance=49438 kB, estimate=81497 kB; lsn=2/20B7918, redo lsn=1/FF0E70A8 2025-08-25 11:06:47.670 CEST [3118] LOG: recovery restart point at 1/FF0E70A8 2025-08-25 11:06:47.670 CEST [3118] DETAIL: Last completed transaction was at log time 2025-08-25 10:31:27.632146+02. 2025-08-25 11:06:47.671 CEST [3118] LOG: restartpoint starting: wal 2025-08-25 11:06:48.062 CEST [3120] LOG: invalid magic number 0000 in WAL segment 000000030000000200000006, LSN 2/686C000, offset 8830976 2025-08-25 11:06:48.063 CEST [3120] LOG: redo done at 2/686BFB0 system usage: CPU: user: 2.54 s, system: 6.48 s, elapsed: 17.16 s 2025-08-25 11:06:48.063 CEST [3120] LOG: last completed transaction was at log time 2025-08-25 10:31:45.384122+02 cp: cannot stat '/tmp/receivewal/000000030000000200000006': No such file or directory 2025-08-25 11:06:48.143 CEST [3120] LOG: restored log file "000000030000000200000006" from archive cp: cannot stat '/tmp/receivewal/00000004.history': No such file or directory cp: cannot stat '/tmp/receivewal/00000004.history.partial': No such file or directory 2025-08-25 11:06:48.306 CEST [3120] LOG: selected new timeline ID: 4 2025-08-25 11:06:48.461 CEST [3120] LOG: restored log file "00000003.history" from archive 2025-08-25 11:06:48.487 CEST [3120] LOG: archive recovery complete 2025-08-25 11:06:49.671 CEST [3118] LOG: restartpoint complete: wrote 6161 buffers (37.6%); 0 WAL file(s) added, 0 removed, 1 recycled; write=1.025 s, sync=0.916 s, total=2.001 s; sync files=44, longest=0.645 s, average=0.021 s; distance=50765 kB, estimate=78424 kB; lsn=2/4E58E48, redo lsn=2/227A4F8 2025-08-25 11:06:49.671 CEST [3118] LOG: recovery restart point at 2/227A4F8 2025-08-25 11:06:49.671 CEST [3118] DETAIL: Last completed transaction was at log time 2025-08-25 10:31:45.384122+02. 2025-08-25 11:06:49.682 CEST [3118] LOG: checkpoint starting: end-of-recovery immediate wait 2025-08-25 11:06:50.385 CEST [3118] LOG: checkpoint complete: wrote 10155 buffers (62.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.600 s, sync=0.062 s, total=0.714 s; sync files=21, longest=0.013 s, average=0.003 s; distance=71622 kB, estimate=77744 kB; lsn=2/686BFF0, redo lsn=2/686BFF0 2025-08-25 11:06:50.428 CEST [3116] LOG: database system is ready to accept connections Command used for copying WAL and history files copied 000000030000000200000006.partial file and it was used in restore process but giving false reference to /tmp/receivewal/00000004.history.partial. Maybe better template for including partial files may be used. But as an Administrator I expect this %f parameter to cover all files required to restore for recovery_target_timeline = 'latest'. Especially that restore_command = 'cp /tmp/receivewal/%f %p' definition example is used in Postgres documentation as sufficient to get maximum possible restore of data. I found "restore_command = 'cp /tmp/receivewal/%f %p || cp /tmp/receivewal/%f.partial %p'" on page: https://www.cybertec-postgresql.com/en/never-lose-a-postgresql-transaction-with-pg_receivewal/#archive-recovery-and-partial-wal-segments published 10.2019, so the problem probably also exists in older Postgres versions.
pgsql-bugs by date: