I'm trying to set up a WAL recovery system so that I have a hot-spare database server standing by should my main one fail. The idea is that every night the WAL logs for that day will be shipped from the main server to the standby, and the standby will replay them so that it is up to date.
Every week a full backup will be taken of the live system, and stored off-site.
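To make the nightly shipping step concrete, here is a minimal sketch in Python of the kind of job I have in mind. The function name and the two directory arguments are hypothetical, purely for illustration; it assumes both the archive and the standby's copy of it are reachable as local paths.

```python
import shutil
from pathlib import Path
from typing import List


def ship_new_segments(archive_dir: Path, standby_dir: Path) -> List[str]:
    """Copy files that exist in archive_dir but not yet in standby_dir.

    Both directories are hypothetical placeholders. Returns the names of
    the files that were copied, in sorted order.
    """
    standby_dir.mkdir(parents=True, exist_ok=True)
    shipped = []
    for src in sorted(archive_dir.iterdir()):
        dest = standby_dir / src.name
        if src.is_file() and not dest.exists():
            # Copy under a temporary name, then rename, so that a partly
            # copied file never appears under its final name.
            tmp = standby_dir / (src.name + ".tmp")
            shutil.copy2(src, tmp)
            tmp.rename(dest)
            shipped.append(src.name)
    return shipped
```

Running it a second time over the same directories copies nothing, so only the files archived since the previous run get shipped.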
So far I've got it working so that:
- My full base backup from yesterday has been loaded onto the spare
- The WAL logs up to 2PM today have been shipped and replayed onto the spare

All OK up to here.
However, whenever I try to ship more logs and replay them, I get the following error on the final file:
2006-02-22 15:50:00 GMT LOG: starting archive recovery
2006-02-22 15:50:00 GMT LOG: restore_command = "cp /mndata/archive/xlog_archive/%f %p"
cp: cannot stat `/mndata/archive/xlog_archive/00000001.history': No such file or directory
2006-02-22 15:50:00 GMT LOG: restored log file "0000000100000000000000D9" from archive
2006-02-22 15:50:00 GMT LOG: invalid record length at 0/D9FFDB84
2006-02-22 15:50:00 GMT LOG: invalid primary checkpoint record
2006-02-22 15:50:00 GMT LOG: restored log file "0000000100000000000000D9" from archive
2006-02-22 15:50:00 GMT LOG: restored log file "0000000100000000000000DA" from archive
2006-02-22 15:50:00 GMT LOG: invalid resource manager ID in secondary checkpoint record
2006-02-22 15:50:00 GMT PANIC: could not locate a valid checkpoint record
2006-02-22 15:50:00 GMT LOG: startup process (PID 20792) was terminated by signal 6
2006-02-22 15:50:00 GMT LOG: aborting startup due to startup process failure
2006-02-22 15:50:00 GMT LOG: logger shutting down
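To help read the file names and the log position in that output, here is a small Python sketch. It assumes the usual layout for this generation of PostgreSQL: a 24-hex-digit segment file name made of timeline, log id and segment number (8 digits each), and the default 16 MB segment size. The function names are my own, for illustration only.

```python
SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default WAL segment size (16 MB)


def parse_segment_name(name):
    """Split a 24-hex-digit WAL file name into (timeline, log id, segment)."""
    if len(name) != 24:
        raise ValueError("expected 24 hex digits, got %r" % name)
    return int(name[0:8], 16), int(name[8:16], 16), int(name[16:24], 16)


def locate_lsn(lsn):
    """Map a position such as '0/D9FFDB84' to (log id, segment, byte offset)."""
    log_hex, offset_hex = lsn.split("/")
    offset = int(offset_hex, 16)
    return int(log_hex, 16), offset // SEGMENT_SIZE, offset % SEGMENT_SIZE


print(parse_segment_name("0000000100000000000000D9"))  # (1, 0, 217)
print(locate_lsn("0/D9FFDB84"))                        # (0, 217, 16767876)
```

Under those assumptions the position 0/D9FFDB84 falls in segment 0xD9 (217) on timeline 1, about 9 kB before the end of that file, so the "invalid record length" message refers to the tail of 0000000100000000000000D9.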
However, if I delete my PG data directory, restore the same base backup from yesterday, and begin recovery, it recovers right through to the last log file, the one on which the previous roll-forward attempt failed. The log files were fully archived off the live server to begin with, so I can't see that it's because they've changed or anything.
Is this scenario possible, i.e. can you keep rolling forward over new log files for as long as necessary, or do you always have to start again from a base backup? Nothing is changing on the spare; it's literally a sitting duck.