WAL recovery - Mailing list pgsql-admin

From Andy Shellam
Subject WAL recovery
Date
Msg-id 43FC90C9.3090204@mailnetwork.co.uk
List pgsql-admin
Hi,

I'm trying to set up WAL-based recovery so that I have a hot-spare database server standing by should my main one fail.
The idea is that every night the WAL logs for that day will be shipped from the main server to the standby, and the standby will replay them so that it is up to date.
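
A minimal sketch of the nightly shipping step described above, assuming the archived WAL files sit in one directory and the standby's archive directory is reachable as a local or mounted path. The function name `ship_new_segments` and both directory arguments are hypothetical; this is an illustration, not the script actually in use.

```python
import shutil
from pathlib import Path


def ship_new_segments(archive_dir, standby_dir):
    """Copy archived WAL files that the standby does not have yet.

    Both directory arguments are hypothetical placeholders.
    Returns the names of the files copied, in sorted order.
    """
    src = Path(archive_dir)
    dst = Path(standby_dir)
    dst.mkdir(parents=True, exist_ok=True)

    shipped = []
    # WAL file names are fixed-width upper-case hex, so a plain
    # lexicographic sort also puts them in replay order.
    for wal in sorted(src.iterdir()):
        if not wal.is_file():
            continue
        target = dst / wal.name
        if target.exists():
            continue  # already shipped on an earlier run
        # Copy under a temporary name, then rename, so that nothing
        # reading the standby directory sees a half-written file.
        tmp = dst / (wal.name + ".tmp")
        shutil.copy2(wal, tmp)
        tmp.rename(target)
        shipped.append(wal.name)
    return shipped
```

Run nightly, a second call with no new files in the archive copies nothing and returns an empty list.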

Every week a full backup will be taken of the live system, and stored off-site.

So far I've got it working so that:

- My full, base backup from yesterday has been loaded onto the spare
- The WAL logs up to 2PM today have been shipped and replayed onto the spare - all OK to here

However, whenever I try to ship more logs and replay them, I get the following error on the final file:

2006-02-22 15:50:00 GMT LOG:  starting archive recovery
2006-02-22 15:50:00 GMT LOG:  restore_command = "cp /mndata/archive/xlog_archive/%f %p"
cp: cannot stat `/mndata/archive/xlog_archive/00000001.history': No such file or directory
2006-02-22 15:50:00 GMT LOG:  restored log file "0000000100000000000000D9" from archive
2006-02-22 15:50:00 GMT LOG:  invalid record length at 0/D9FFDB84
2006-02-22 15:50:00 GMT LOG:  invalid primary checkpoint record
2006-02-22 15:50:00 GMT LOG:  restored log file "0000000100000000000000D9" from archive
2006-02-22 15:50:00 GMT LOG:  restored log file "0000000100000000000000DA" from archive
2006-02-22 15:50:00 GMT LOG:  invalid resource manager ID in secondary checkpoint record
2006-02-22 15:50:00 GMT PANIC:  could not locate a valid checkpoint record
2006-02-22 15:50:00 GMT LOG:  startup process (PID 20792) was terminated by signal 6
2006-02-22 15:50:00 GMT LOG:  aborting startup due to startup process failure
2006-02-22 15:50:00 GMT LOG:  logger shutting down
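
To relate the position in the "invalid record length" message to a segment file, here is a small sketch. It assumes the default 16 MB WAL segment size and the 24-hex-digit file naming seen in the log above (timeline, log id, segment); the helper names are hypothetical.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16 MB segments


def parse_segment_name(name):
    """Split a 24-hex-digit WAL file name into (timeline, log_id, segment)."""
    if len(name) != 24:
        raise ValueError("expected 24 hex digits, got %r" % name)
    return int(name[0:8], 16), int(name[8:16], 16), int(name[16:24], 16)


def locate_lsn(lsn, timeline=1):
    """Map a position such as '0/D9FFDB84' to (segment file name, byte offset)."""
    log_hex, offset_hex = lsn.split("/")
    log_id = int(log_hex, 16)
    record_offset = int(offset_hex, 16)
    segment = record_offset // WAL_SEGMENT_SIZE
    offset_in_segment = record_offset % WAL_SEGMENT_SIZE
    name = "%08X%08X%08X" % (timeline, log_id, segment)
    return name, offset_in_segment


name, offset = locate_lsn("0/D9FFDB84")
print(name, hex(offset), WAL_SEGMENT_SIZE - offset)
```

Under those assumptions, 0/D9FFDB84 falls in 0000000100000000000000D9 at offset 0xFFDB84, which is 9340 bytes before the end of that segment.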

However, if I delete my PG data directory, restore the same base backup from yesterday, and begin recovery, it recovers right through the last log file, the one on which the previous roll-forward attempt failed.
The log files were fully archived off the live server to begin with, so I don't see how they could have changed in the meantime.

Is this scenario possible? Can you keep rolling forward over new log files for as long as necessary, or do you always have to start again from a base backup?  Nothing is changing on the spare; it's literally just sitting there.

Thanks

Andy
