Re: Point-in-time recovery after failover - Mailing list pgsql-general

From Laurenz Albe
Subject Re: Point-in-time recovery after failover
Date
Msg-id 1520941901.2434.3.camel@cybertec.at
Whole thread Raw
In response to Point-in-time recovery after failover  (Dylan Luong <Dylan.Luong@unisa.edu.au>)
List pgsql-general
Dylan Luong wrote: 
> We are on Postgres 9.6 and we have primary/standby wal replication setup for HA.
>  
> I am trying to perform a point-in-time recovery after a failover has occurred.
>  
> I extracted the base backups (tar files) to the data directory and extracted the all the archive backups  (P1 and S1)
tothe archive directory.
 
> Followed the steps from
> Created recovery.conf with the following, but the recovery only recovers, to the point before the failover and comes
online.
>  
> Example of  my recovery.conf :
> restore_command = 'cp /pg_backup/backup/archive/%f %p'
> recovery_target_time = '2018-03-13 13:54:00'
> recovery_target_inclusive = 'false'
> recovery_target_timeline = '6'      (I also tried revovery_target_timeline = ‘lastest)
>  
> Is it possible to recovery to PITR after a failover using base backup from before failover and both archive backups
fromP1 and S1?
 
>  
>  
> Here is the log output :
>  
> 018-03-13 20:46:53 ACDT [154912]: [1-1] db=,user= app=,host= LOG:  database system was interrupted; last known up at
2018-03-1
> 3 01:31:25 ACDT
> 2018-03-13 20:46:53 ACDT [154912]: [2-1] db=,user= app=,host= LOG:  restored log file "00000006.history" from
archive
> cp: cannot stat ‘/pg_backup/backup/archive/00000007.history’: No such file or directory
> 2018-03-13 20:46:53 ACDT [154912]: [3-1] db=,user= app=,host= LOG:  starting point-in-time recovery to 2018-03-13
13:54:00+10:3
> 0
> 2018-03-13 20:46:53 ACDT [154912]: [4-1] db=,user= app=,host= LOG:  restored log file "00000006.history" from
archive
> cp: cannot stat ‘/pg_backup/backup/archive/000000060000011100000087’: No such file or directory
> 2018-03-13 20:46:53 ACDT [154912]: [5-1] db=,user= app=,host= LOG:  restored log file "000000050000011100000087" from
archive
> cp: cannot stat ‘/pg_backup/backup/archive/00000005.history’: No such file or directory
> 2018-03-13 20:46:53 ACDT [154912]: [6-1] db=,user= app=,host= LOG:  redo starts at 111/87000028
> 2018-03-13 20:46:53 ACDT [154912]: [7-1] db=,user= app=,host= LOG:  consistent recovery state reached at
111/870B4418
> 2018-03-13 20:46:53 ACDT [154908]: [3-1] db=,user= app=,host= LOG:  database system is ready to accept read only
connections
> cp: cannot stat ‘/pg_backup/backup/archive/000000060000011100000088’: No such file or directory
> 2018-03-13 20:46:53 ACDT [154912]: [8-1] db=,user= app=,host= LOG:  restored log file "000000050000011100000088" from
archive
> cp: cannot stat ‘/pg_backup/backup/archive/000000060000011100000089’: No such file or directory
> 2018-03-13 20:46:53 ACDT [154912]: [9-1] db=,user= app=,host= LOG:  restored log file "000000050000011100000089" from
archive
> cp: cannot stat ‘/pg_backup/backup/archive/00000006000001110000008A’: No such file or directory
> 2018-03-13 20:46:53 ACDT [154912]: [10-1] db=,user= app=,host= LOG:  restored log file "00000005000001110000008A"
fromarchive
 
[...]
> cp: cannot stat ‘/pg_backup/backup/archive/0000000600000111000000AD’: No such file or directory
> 2018-03-13 20:46:57 ACDT [154912]: [45-1] db=,user= app=,host= LOG:  restored log file "0000000500000111000000AD"
fromarchive
 
> cp: cannot stat ‘/pg_backup/backup/archive/0000000600000111000000AE’: No such file or directory
> cp: cannot stat ‘/pg_backup/backup/archive/0000000500000111000000AE’: No such file or directory
> 2018-03-13 20:46:57 ACDT [154912]: [46-1] db=,user= app=,host= LOG:  redo done at 111/ADFFE160
> 2018-03-13 20:46:57 ACDT [154912]: [47-1] db=,user= app=,host= LOG:  last completed transaction was at log time
2018-03-1313:4
 
> 0:02.36347+10:30
> 
>  
> Here is what is in the archive directory after extracting from the archive backups and after database comes online:
>  
[...]
> -rw-------. 1 postgres postgres 16777216 Mar 13 13:40 0000000500000111000000AA
> -rw-------. 1 postgres postgres 16777216 Mar 13 13:40 0000000500000111000000AB
> -rw-------. 1 postgres postgres 16777216 Mar 13 13:40 0000000500000111000000AC
> -rw-------. 1 postgres postgres 16777216 Mar 13 13:40 0000000500000111000000AD
> -rw-------. 1 postgres postgres 16777216 Mar 13 13:48 0000000500000111000000B1.partial
> -rw-------. 1 postgres postgres 16777216 Mar 13 13:49 0000000600000111000000B1
> -rw-------. 1 postgres postgres 16777216 Mar 13 14:04 0000000600000111000000B2
> -rw-------. 1 postgres postgres      221 Mar 13 13:48 00000006.history
> -rw-------. 1 postgres postgres 16777216 Mar 13 20:48 0000000700000111000000AD
> -rw-------. 1 postgres postgres      271 Mar 13 20:46 00000007.history

You are missing 0000000500000111000000AE or 0000000600000111000000AE.

One of these files should be there.

I suspect that there is a problem with your WAL archiving in connection with failover.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com


pgsql-general by date:

Previous
From: Dylan Luong
Date:
Subject: Point-in-time recovery after failover
Next
From: Melvin Davidson
Date:
Subject: Re: Programmatically duplicating a schema