Thread: Online Backups: Minor Caveat, Major Addition?
I just completed my first online backup and recovery cycle in a test environment. I encountered a minor hiccup at step 8 of the process outlined in the docs in 23.3.3, "Recovering with an On-line Backup":

http://www.postgresql.org/docs/8.1/static/backup-online.html#BACKUP-PITR-RECOVERY

A base backup taken from a running postmaster will still include a postmaster.pid file, which will prevent a new postmaster from being able to be started.

As a quick fix, a note could be added to item 4 of the recovery process that reads something like the following:

Note: If you are recovering to a new cluster and your base backup was taken from a running postmaster, you will need to remove postmaster.pid if it exists in order to start the new postmaster.

In reviewing this section of the documentation, though, I'm wondering if it might not be more clearly broken into 3 further subsections. I would suggest:

23.3.3.1 - Inline (or In-place) Recovery
23.3.3.2 - Remote Recovery or Recovery into a New Cluster
23.3.3.3 - Continuous Recovery

The "Inline Recovery" section would just be the existing 23.3.3 repurposed.

The "Remote/New Cluster Recovery" section would be an edited version of the existing 23.3.3 to eliminate step 1 and include the note suggested above.

The "Continuous Recovery" section would include details of how to continuously apply WAL files to a separate cluster in order to have a true hot standby system.

Thoughts?

I'd be happy to draft 23.3.3.2. I'll have to figure out how to implement Simon Riggs's suggestion of a wait-for-files restore_command including a way to interrupt in the event of a need for actual failover-style recovery before I could draft 23.3.3.3, though.

--
Thomas F. O'Connell
Database Architecture and Programming
Co-Founder
Sitening, LLC
http://www.sitening.com/
3004 B Poston Avenue
Nashville, TN 37203-1314
615-260-0005 (cell)
615-469-5150 (office)
615-469-5151 (fax)

"Thomas F. O'Connell" <tfo@sitening.com> writes:
> A base backup taken from a running postmaster will still include a
> postmaster.pid file, which will prevent a new postmaster from being
> able to be started.

Usually not; only if the PID mentioned in the file belongs to an existing process belonging to the postgres userid does Postgres believe that the pidfile is valid.

It might be worth mentioning this as you suggest, but I think it's a sufficiently low-probability case that your failure was probably due to something else.

regards, tom lane

On Mar 20, 2006, at 4:48 PM, Tom Lane wrote:
> "Thomas F. O'Connell" <tfo@sitening.com> writes:
>> A base backup taken from a running postmaster will still include a
>> postmaster.pid file, which will prevent a new postmaster from being
>> able to be started.
>
> Usually not; only if the PID mentioned in the file belongs to an
> existing process belonging to the postgres userid does Postgres believe
> that the pidfile is valid.
>
> It might be worth mentioning this as you suggest, but I think it's a
> sufficiently low-probability case that your failure was probably due to
> something else.

My test scenario involved setting up a new cluster on the same machine as the base postgres I was attempting to recover. So you're probably right about the rarity.

What about the larger suggested change of breaking that section into three more granular subsections? I could see commentary being slightly more helpful for each.

--
Thomas F. O'Connell

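The pidfile check Tom Lane describes above can be illustrated with a short sketch. This is not PostgreSQL's actual code, only a Python approximation of the rule as stated in this thread. It assumes a POSIX system, assumes the first line of postmaster.pid holds the PID, and assumes it is run as the postgres user. The function names and the `probe` parameter are hypothetical, introduced here for illustration.

```python
import os
from pathlib import Path


def pid_is_live_for_current_user(pid, probe=os.kill):
    """Return True if `pid` is a running process the current user may signal."""
    if pid <= 0:
        return False  # avoid signalling process groups; treat as not live
    try:
        probe(pid, 0)  # signal 0: existence/permission check only, nothing is sent
    except ProcessLookupError:
        return False  # no such process, so the pidfile is stale
    except PermissionError:
        return False  # process exists but belongs to a different user
    return True


def pidfile_blocks_startup(data_dir, probe=os.kill):
    """Approximate the rule from the thread: the pidfile only counts as valid
    if its PID is an existing process owned by the same (postgres) user."""
    pidfile = Path(data_dir) / "postmaster.pid"
    if not pidfile.exists():
        return False
    lines = pidfile.read_text().splitlines()
    try:
        pid = int(lines[0].strip())  # assumption: first line is the PID
    except (IndexError, ValueError):
        return False  # assumption of this sketch: unreadable file treated as stale
    return pid_is_live_for_current_user(pid, probe)
```

Under this rule, a base backup restored on a different machine would rarely trip the check, whereas restoring on the same machine while the original postmaster is still running (the test scenario described above) is exactly the case where the recorded PID is live and owned by postgres.
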
"Thomas F. O'Connell" <tfo@sitening.com> writes:
> What about the larger suggested change of breaking that section into
> three more granular subsections? I could see commentary being
> slightly more helpful for each.

No particular opinion from here.

Someone (was it Scott Marlowe?) recently volunteered to draft a complete restructuring of the admin docs --- so it would probably be better to think about this as part of that effort rather than a standalone change.

regards, tom lane

On Mon, 2006-03-20 at 17:12, Tom Lane wrote:
> "Thomas F. O'Connell" <tfo@sitening.com> writes:
> > What about the larger suggested change of breaking that section into
> > three more granular subsections? I could see commentary being
> > slightly more helpful for each.
>
> No particular opinion from here.
>
> Someone (was it Scott Marlowe?) recently volunteered to draft a
> complete restructuring of the admin docs --- so it would probably be
> better to think about this as part of that effort rather than a
> standalone change.

That was me, and I'm working on it right now, in the background.

If anyone has any input, I'd be glad to hear it. I'm still getting DocBook set up on my laptop and such...

On Mon, Mar 20, 2006 at 09:54:53AM -0600, Thomas F. O'Connell wrote:
> 23.3.3.1 - Inline (or In-place) Recovery
> 23.3.3.2 - Remote Recovery or Recovery into a New Cluster
> 23.3.3.3 - Continuous Recovery
>
> The "Inline Recovery" section would just be the existing 23.3.3
> repurposed.
>
> The "Remote/New Cluster Recovery" section would be an edited version
> of the existing 23.3.3 to eliminate step 1 and include the note
> suggested above.
>
> The "Continuous Recovery" section would include details of how to
> continuously apply WAL files to a separate cluster in order to have a
> true hot standby system.
>
> Thoughts?
>
> I'd be happy to draft 23.3.3.2. I'll have to figure out how to
> implement Simon Riggs's suggestion of a wait-for-files
> restore_command including a way to interrupt in the event of a need
> for actual failover-style recovery before I could draft 23.3.3.3,
> though.

BTW, when it comes to continuous recovery you should have a look at http://pgfoundry.org/projects/pgpitrha/

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

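The wait-for-files idea quoted above (a restore command that blocks until the next WAL file arrives, with a way to interrupt it for failover) could look roughly like the following. This is a sketch under stated assumptions, not Simon Riggs's design and not the pgpitrha implementation. The function name `wait_for_wal`, the trigger-file mechanism, and the early return for timeline history files are all assumptions made for illustration. It also assumes archived files appear atomically in the archive directory (for example, copied under a temporary name and then renamed).

```python
import shutil
import sys
import time
from pathlib import Path


def wait_for_wal(archive_dir, wal_name, dest_path, trigger_path,
                 poll_seconds=5.0, sleep=time.sleep):
    """Wait until wal_name shows up in archive_dir, then copy it to dest_path.

    Returns 0 once the file has been copied. Returns 1 (file not available)
    if the trigger file exists, which lets recovery finish so the standby
    can come up for failover.
    """
    source = Path(archive_dir) / wal_name
    trigger = Path(trigger_path)
    while True:
        if source.exists():
            shutil.copy(source, dest_path)
            return 0
        if wal_name.endswith(".history"):
            # Assumption: a missing timeline history file is normal, so do
            # not wait for it to appear.
            return 1
        if trigger.exists():
            return 1  # operator asked for failover; stop waiting
        sleep(poll_seconds)


if __name__ == "__main__" and len(sys.argv) == 5:
    # Hypothetical invocation: wait_for_wal.py ARCHIVE_DIR %f %p TRIGGER_FILE
    sys.exit(wait_for_wal(*sys.argv[1:5]))
```

The design relies on the documented behaviour that a restore command returning a nonzero exit status signals that the requested file is unavailable; creating the trigger file is then the "interrupt" that ends continuous recovery.
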
On Mar 20, 2006, at 5:21 PM, Scott Marlowe wrote:
> On Mon, 2006-03-20 at 17:12, Tom Lane wrote:
>> "Thomas F. O'Connell" <tfo@sitening.com> writes:
>>> What about the larger suggested change of breaking that section into
>>> three more granular subsections? I could see commentary being
>>> slightly more helpful for each.
>>
>> No particular opinion from here.
>>
>> Someone (was it Scott Marlowe?) recently volunteered to draft a
>> complete restructuring of the admin docs --- so it would probably be
>> better to think about this as part of that effort rather than a
>> standalone change.
>
> That was me, and I'm working on it right now, in the background.
>
> If anyone has any input, I'd be glad to hear it. I'm still getting
> DocBook set up on my laptop and such...

Well, my input is to break 23.3.3 out into 3 practical scenarios:

1) recovery (same-server)
2) one-time (or occasional) remote backup
3) continuous recovery

I just noticed another caveat for remote recovery: disable archive_command in the postgresql.conf from the filesystem-level backup.

--
Thomas F. O'Connell

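The archive_command caveat above amounts to a small edit of the restored postgresql.conf before the new postmaster is started. A minimal sketch of that edit follows, assuming the setting sits on a single uncommented line. The helper name `disable_archive_command` is hypothetical, and the code only transforms text, so reading and writing the file is left to the caller.

```python
import re

# Matches an active (uncommented) archive_command setting at the start of a line.
_ARCHIVE_LINE = re.compile(r"^\s*archive_command\b")


def disable_archive_command(conf_text):
    """Return conf_text with any active archive_command line commented out."""
    out = []
    for line in conf_text.splitlines(keepends=True):
        if _ARCHIVE_LINE.match(line):
            out.append("#" + line)  # comment out rather than delete, to keep a record
        else:
            out.append(line)
    return "".join(out)
```

With the line commented out, archive_command falls back to its default of an empty string, which in 8.1 means WAL archiving is disabled, so the recovered cluster does not write its own segments into the original server's archive.
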
On Mon, 2006-03-20 at 09:54 -0600, Thomas F. O'Connell wrote:
> 23.3.3.1 - Inline (or In-place) Recovery
> 23.3.3.2 - Remote Recovery or Recovery into a New Cluster
> 23.3.3.3 - Continuous Recovery
>
> The "Inline Recovery" section would just be the existing 23.3.3
> repurposed.
>
> The "Remote/New Cluster Recovery" section would be an edited version
> of the existing 23.3.3 to eliminate step 1 and include the note
> suggested above.
>
> The "Continuous Recovery" section would include details of how to
> continuously apply WAL files to a separate cluster in order to have a
> true hot standby system.
>
> Thoughts?

I'd include these as specific cases of the general workflow/checklist. So, additions rather than refactoring. More words is better for most people, I think.

Best Regards, Simon Riggs