BUG #15589: Due to missing wal,restore ends prematurely and opens database for read/write - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #15589: Due to missing wal,restore ends prematurely and opens database for read/write
Date
Msg-id 15589-8e6e353a97b085fb@postgresql.org
Whole thread Raw
Responses Re: BUG #15589: Due to missing wal, restore ends prematurely andopens database for read/write
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      15589
Logged by:          Leif Gunnar Erlandsen
Email address:      leif@lako.no
PostgreSQL version: 11.1
Operating system:   Red Hat Enterprise Linux Server release 7.6
Description:

recovery.conf was specified as
restore_command = 'cp /db/disk1/restore/archivelogs/archivelogs/%f %p'
recovery_target_time = '2019-01-03 13:00:00'

Due to some missing wal-files restore ended.

2019-01-10 12:05:50 CET [68417]: [67-1] user=,db=,app=,client= LOG:
restored log file "0000000500000158000000FF" from archive
cp: cannot stat
'/db/disk1/restore/archivelogs/archivelogs/000000050000015900000000': No
such file or directory
2019-01-10 12:05:50 CET [68417]: [68-1] user=,db=,app=,client= LOG:  redo
done at 158/FFFFFFB8
2019-01-10 12:05:50 CET [68417]: [69-1] user=,db=,app=,client= LOG:  last
completed transaction was at log time 2019-01-03 06:34:45.935752+01
2019-01-10 12:05:50 CET [68417]: [70-1] user=,db=,app=,client= LOG:
restored log file "0000000500000158000000FF" from archive
cp: cannot stat
'/db/disk1/restore/archivelogs/archivelogs/00000006.history': No such file
or directory
2019-01-10 12:05:50 CET [68417]: [71-1] user=,db=,app=,client= LOG:
selected new timeline ID: 6
2019-01-10 12:05:50 CET [68417]: [72-1] user=,db=,app=,client= LOG:  archive
recovery complete
cp: cannot stat
'/db/disk1/restore/archivelogs/archivelogs/00000005.history': No such file
or directory
2019-01-10 12:05:51 CET [68420]: [2-1] user=,db=,app=,client= LOG:
restartpoint complete: wrote 44395 buffers (5.4%); 1 WAL file(s) added, 0
removed, 0 recycled; write=6.310 s, sync=0.268 s, total=6.631 s; sync
files=178, longest=0.072 s, average=0.001 s; distance=64019 kB,
estimate=64019 kB
2019-01-10 12:05:51 CET [68420]: [3-1] user=,db=,app=,client= LOG:  recovery
restart point at 158/C4E84F98
2019-01-10 12:05:51 CET [68420]: [4-1] user=,db=,app=,client= DETAIL:  Last
completed transaction was at log time 2019-01-03 06:34:45.935752+01.
2019-01-10 12:05:51 CET [68420]: [5-1] user=,db=,app=,client= LOG:
checkpoints are occurring too frequently (7 seconds apart)
2019-01-10 12:05:51 CET [68420]: [6-1] user=,db=,app=,client= HINT:
Consider increasing the configuration parameter "max_wal_size".
2019-01-10 12:05:51 CET [68420]: [7-1] user=,db=,app=,client= LOG:
checkpoint starting: end-of-recovery immediate wait xlog
2019-01-10 12:05:51 CET [68420]: [8-1] user=,db=,app=,client= LOG:
checkpoint complete: wrote 18678 buffers (2.3%); 0 WAL file(s) added, 0
removed, 0 recycled; write=0.251 s, sync=0.006 s, total=0.312 s; sync
files=149, longest=0.006 s, average=0.000 s; distance=968172 kB,
estimate=968172 kB
2019-01-10 12:05:51 CET [68415]: [8-1] user=,db=,app=,client= LOG:  database
system is ready to accept connections


I found the missing wal-files and performed restore again from the start.
Then recovery paused when it was at correct time.

2019-01-10 13:46:28 CET [77004]: [3334-1] user=,db=,app=,client= LOG:
restored log file "0000000500000165000000C2" from archive
2019-01-10 13:46:28 CET [77007]: [318-1] user=,db=,app=,client= LOG:
restartpoint complete: wrote 87591 buffers (10.7%); 0 WAL file(s) added, 22
removed, 20 recycled; write=3.049 s, sync=0.001 s, total=3.192 s; sync
files=143, longest=0.001 s, average=0.000 s; distance=688531 kB,
estimate=689818 kB
2019-01-10 13:46:28 CET [77007]: [319-1] user=,db=,app=,client= LOG:
recovery restart point at 165/9706C358
2019-01-10 13:46:28 CET [77007]: [320-1] user=,db=,app=,client= DETAIL:
Last completed transaction was at log time 2019-01-03 12:13:22.014815+01.
2019-01-10 13:46:28 CET [77007]: [321-1] user=,db=,app=,client= LOG:
restartpoint starting: xlog
2019-01-10 13:46:29 CET [77004]: [3335-1] user=,db=,app=,client= LOG:
restored log file "0000000500000165000000C3" from archive
2019-01-10 13:46:29 CET [77004]: [3336-1] user=,db=,app=,client= LOG:
recovery stopping before commit of transaction 3316604, time 2019-01-03
13:00:01.563953+01
2019-01-10 13:46:29 CET [77004]: [3337-1] user=,db=,app=,client= LOG:
recovery has paused
2019-01-10 13:46:29 CET [77004]: [3338-1] user=,db=,app=,client= HINT:
Execute pg_wal_replay_resume() to continue.


PostgreSQL should have paused recovery also on the first scenario. Then I
could have added missing wal and continued with restore.


pgsql-bugs by date:

Previous
From: Thomas Munro
Date:
Subject: Re: BUG #15585: infinite DynamicSharedMemoryControlLock waiting inparallel query
Next
From: Jeff Janes
Date:
Subject: Re: BUG #15589: Due to missing wal, restore ends prematurely andopens database for read/write