Re: Cannot rebuild a standby server - Mailing list pgsql-admin
From | Kevin Grittner |
---|---|
Subject | Re: Cannot rebuild a standby server |
Date | |
Msg-id | 1403296569.12353.YahooMailNeo@web122303.mail.ne1.yahoo.com Whole thread Raw |
In response to | Re: Cannot rebuild a standby server (John Scalia <jayknowsunix@gmail.com>) |
List | pgsql-admin |
John Scalia <jayknowsunix@gmail.com> wrote: > On 6/20/2014 2:09 PM, Kevin Grittner wrote: >> The documentation is your friend. It gives pretty specific >> instructions for what to do. > Well, I did finally get it working by adding -X s -c fast to the > pg_basebackup command. Hopefully you did not follow that with a copy of pg_xlog files from a running server. > Kevin, if I didn't copy WALs over, the database still refused to > start as it claimed it was looking for a one of the first > specific files. You absolutely need WAL files for recovery. You absolutely should not copy the files ad hoc from a running cluster, as they may be in flux at the time of the copy. The archive_command is called at the right point in time to get a good copy, and using the pg_basebackup command will also acquire them in a safe way. > Also, I've not seen any references to removing certain files like > a backup_label file in the standby's data directory causing > problems. On this page: http://www.postgresql.org/docs/current/static/continuous-archiving.html ... it says: | pg_start_backup creates a backup label file, called backup_label, | in the cluster directory with information about your backup, | including the start time and label string. The file is critical | to the integrity of the backup, should you need to restore from | it. The same page later says: | It's also worth noting that the pg_start_backup function makes a | file named backup_label in the database cluster directory, which | is removed by pg_stop_backup. This file will of course be | archived as a part of your backup dump file. The backup label | file includes the label string you gave to pg_start_backup, as | well as the time at which pg_start_backup was run, and the name | of the starting WAL file. In case of confusion it is therefore | possible to look inside a backup dump file and determine exactly | which backup session the dump file came from. However, this file | is not merely for your information; its presence and contents are | critical to the proper operation of the system's recovery | process. These are related to the pg_start_backup function and apply whether you run that function directly or use pg_basebackup, which runs it. > The other files I removed were the old postgresql.pid file from > the primary You should remove that, or exclude it from your backup in the first place. > and a file called archiving_active, which I use for controlling > whether postgresql writes WAL files or not. That's based on your particular archiving strategy; without more info on that I can't comment. > Seems a little funny to me that I've done this same procedure for > over 4 months with no problems, and today was the first time it > bit me. It's sorta like crossing a highway without looking either way. It might work many times. Then again, a Mac truck might be coming just that one time. If you would like to have it work consistently, rather than gamble each time, don't copy WAL files from the pg_xlog directory of a live server except through the documented methods, and don't delete the backup_label file. If you do either of those things you risks errors like you just saw. Worse, you risk not getting errors but having a silently corrupted database. Unless you enjoy the thrill of it.... ;-) -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-admin by date: