Thread: BUG #13977: Strange behavior with WAL archive recovery
The following bug has been logged on the website: Bug reference: 13977 Logged by: Sergey Burladyan Email address: eshkinkot@gmail.com PostgreSQL version: 9.4.6 Operating system: Debian GNU/Linux 8.2 (jessie) Description: Hello I see in postgresql.log what server constantly receives the same file from WAL archive. I does not see this in 9.2 version. recovery.conf: restore_command = 'restore_cmd /mnt/WALs %f %p' recovery_target_timeline = 'latest' standby_mode = on 2016-02-22 17:23:15.820 MSK pid=19436,user=,db=,host= LOG: restored log file "00000001000074A7000000C1" from archive 2016-02-22 17:23:16.323 MSK pid=19436,user=,db=,host= LOG: restored log file "00000001000074A7000000C2" from archive 2016-02-22 17:23:16.639 MSK pid=19436,user=,db=,host= LOG: restored log file "00000001000074A7000000C3" from archive 2016-02-22 17:23:16.983 MSK pid=19436,user=,db=,host= LOG: restored log file "00000001000074A7000000C4" from archive 2016-02-22 17:23:17.119 MSK pid=19436,user=,db=,host= LOG: unexpected pageaddr 74A4/C8000000 in log segment 00000001000074A7000000C5, offset 0 2016-02-22 17:23:20.279 MSK pid=19436,user=,db=,host= LOG: restored log file "00000001000074A7000000C4" from archive 2016-02-22 17:23:20.473 MSK pid=19436,user=,db=,host= LOG: restored log file "00000001000074A7000000C5" from archive 2016-02-22 17:23:20.809 MSK pid=19436,user=,db=,host= LOG: restored log file "00000001000074A7000000C6" from archive 2016-02-22 17:23:21.189 MSK pid=19436,user=,db=,host= LOG: restored log file "00000001000074A7000000C7" from archive 2016-02-22 17:23:21.361 MSK pid=19436,user=,db=,host= LOG: unexpected pageaddr 74A4/D5000000 in log segment 00000001000074A7000000C8, offset 0 2016-02-22 17:23:25.527 MSK pid=19436,user=,db=,host= LOG: restored log file "00000001000074A7000000C7" from archive 2016-02-22 17:23:30.695 MSK pid=19436,user=,db=,host= LOG: restored log file "00000001000074A7000000C7" from archive 2016-02-22 17:23:35.864 MSK pid=19436,user=,db=,host= LOG: restored log file "00000001000074A7000000C7" from archive 2016-02-22 17:23:41.033 MSK pid=19436,user=,db=,host= LOG: restored log file "00000001000074A7000000C7" from archive 2016-02-22 17:23:45.200 MSK pid=19436,user=,db=,host= LOG: restored log file "00000001000074A7000000C7" from archive 2016-02-22 17:23:50.374 MSK pid=19436,user=,db=,host= LOG: restored log file "00000001000074A7000000C7" from archive 2016-02-22 17:23:55.546 MSK pid=19436,user=,db=,host= LOG: restored log file "00000001000074A7000000C7" from archive 2016-02-22 17:24:00.717 MSK pid=19436,user=,db=,host= LOG: restored log file "00000001000074A7000000C7" from archive
Hi, > On 22 Feb 2016, at 15:31, eshkinkot@gmail.com wrote: >=20 > The following bug has been logged on the website: >=20 > Bug reference: 13977 > Logged by: Sergey Burladyan > Email address: eshkinkot@gmail.com > PostgreSQL version: 9.4.6 > Operating system: Debian GNU/Linux 8.2 (jessie) > Description: =20 >=20 > Hello >=20 > I see in postgresql.log what server constantly receives the same file = from > WAL archive. I does not see this in 9.2 version. >=20 > recovery.conf: > restore_command =3D 'restore_cmd /mnt/WALs %f %p' > recovery_target_timeline =3D 'latest' > standby_mode =3D on >=20 > 2016-02-22 17:23:15.820 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: = restored log file > "00000001000074A7000000C1" from archive > 2016-02-22 17:23:16.323 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: = restored log file > "00000001000074A7000000C2" from archive > 2016-02-22 17:23:16.639 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: = restored log file > "00000001000074A7000000C3" from archive > 2016-02-22 17:23:16.983 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: = restored log file > "00000001000074A7000000C4" from archive > 2016-02-22 17:23:17.119 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: = unexpected > pageaddr 74A4/C8000000 in log segment 00000001000074A7000000C5, offset = 0 > 2016-02-22 17:23:20.279 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: = restored log file > "00000001000074A7000000C4" from archive > 2016-02-22 17:23:20.473 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: = restored log file > "00000001000074A7000000C5" from archive > 2016-02-22 17:23:20.809 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: = restored log file > "00000001000074A7000000C6" from archive > 2016-02-22 17:23:21.189 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: = restored log file > "00000001000074A7000000C7" from archive > 2016-02-22 17:23:21.361 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: = unexpected > pageaddr 74A4/D5000000 in log segment 00000001000074A7000000C8, offset = 0 > 2016-02-22 17:23:25.527 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: = restored log file > "00000001000074A7000000C7" from archive > 2016-02-22 17:23:30.695 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: = restored log file > "00000001000074A7000000C7" from archive > 2016-02-22 17:23:35.864 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: = restored log file > "00000001000074A7000000C7" from archive > 2016-02-22 17:23:41.033 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: = restored log file > "00000001000074A7000000C7" from archive > 2016-02-22 17:23:45.200 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: = restored log file > "00000001000074A7000000C7" from archive > 2016-02-22 17:23:50.374 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: = restored log file > "00000001000074A7000000C7" from archive > 2016-02-22 17:23:55.546 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: = restored log file > "00000001000074A7000000C7" from archive > 2016-02-22 17:24:00.717 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: = restored log file > "00000001000074A7000000C7" from archive I think it=E2=80=99s expected. PostgreSQL reports the latest file it has = restored, meaning that once it restored 00000001000074A7000000C7 it was = not able to get the following one (presumably, 00000001000074A7000000C8, = unless there was a timeline change).=20 If there is a 00000001000074A7000000C8 in the archive, then perhaps = there is something wrong with it (check the md5). Sometimes turning on = log_min_messages=3D=E2=80=98debug=E2=80=99 helps finding out the reason = why PostgreSQL is unable to restore the given segment. -- Oleksii
On Thu, Feb 25, 2016 at 12:54 AM, Oleksii Kliukin <alexk@hintbits.com> wrot= e: >> On 22 Feb 2016, at 15:31, eshkinkot@gmail.com wrote: >> 2016-02-22 17:23:16.983 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: resto= red log file >> "00000001000074A7000000C4" from archive >> 2016-02-22 17:23:17.119 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: unexp= ected >> pageaddr 74A4/C8000000 in log segment 00000001000074A7000000C5, offset 0 [...] >> 2016-02-22 17:23:21.189 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: resto= red log file >> "00000001000074A7000000C7" from archive >> 2016-02-22 17:23:21.361 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: unexp= ected >> pageaddr 74A4/D5000000 in log segment 00000001000074A7000000C8, offset 0 Those two entries are weird. A segment has been restored, but then a complain comes from segment+1 in the error context given here. Looking at xlogreader.c there is no one-off kind of problems in the reader machinery. >> 2016-02-22 17:23:25.527 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: resto= red log file >> "00000001000074A7000000C7" from archive >> [...] >> 2016-02-22 17:23:30.695 MSK pid=3D19436,user=3D,db=3D,host=3D LOG: resto= red log file >> "00000001000074A7000000C7" from archive > > I think it=E2=80=99s expected. PostgreSQL reports the latest file it has = restored, meaning that once it restored 00000001000074A7000000C7 it was not= able to get the following one (presumably, 00000001000074A7000000C8, unles= s there was a timeline change). This error message is reported by Postgres once a segment is being correctly restored in pg_xlog and that it is found to have a correct size. What is actually restore_cmd doing? --=20 Michael