Thread: recovery via base + WAL replay failure
I'm trying to demonstrate recovery using the "continuous archiving" backup technique. I'm using 8.3 on Windows. I made a base backup while the postgres was running using the following batch file: -------------------------- psql -d test_database -U user_name -c "SELECT pg_start_backup('test');" 7za a -tzip "C:\backup\base.zip" "C:\Program Files\PostgreSQL\8.3\data" psql -d test_database -U user_name -c "SELECT pg_stop_backup();" -------------------------- No indication of any errors. However, I am unable to recover. Here are my steps: 1. Stop service, replace the data directory w/ the base backup. 2. Create recovery.conf with this setting: restore_command = 'copy C:\backup\%f "%p"' 3. Try to start service After about 90 seconds, the dos prompt displays: "(postgres 8.3) service could not be started" "The service did not report an error" Does anyone know what I am probably doing wrong? Thanks, --Rob Adams
Rob Adams wrote: > I'm trying to demonstrate recovery using the "continuous archiving" > backup technique. I'm using 8.3 on Windows. > > I made a base backup while the postgres was running using the following > batch file: > > -------------------------- > psql -d test_database -U user_name -c "SELECT pg_start_backup('test');" > > 7za a -tzip "C:\backup\base.zip" "C:\Program Files\PostgreSQL\8.3\data" > > psql -d test_database -U user_name -c "SELECT pg_stop_backup();" > -------------------------- > > No indication of any errors. However, I am unable to recover. Here are > my steps: > > 1. Stop service, replace the data directory w/ the base backup. > 2. Create recovery.conf with this setting: > restore_command = 'copy C:\backup\%f "%p"' > 3. Try to start service > > After about 90 seconds, the dos prompt displays: > "(postgres 8.3) service could not be started" > "The service did not report an error" > > Does anyone know what I am probably doing wrong? You'll need to look at the PostgreSQL logs to see what they say. But I'm curious about the 7za step. It gives no errors or warnings? IIRC, you need to use VSS on Windows to avoid sharing violations when opening the files unless a specific open mode is used. It could be that 7za uses that by default, but it's also possible tha tit's not actually backing up all files... //Magnus
On Sun, 3 Aug 2008, Rob Adams wrote: > I made a base backup while the postgres was running using the following batch > file: > psql -d test_database -U user_name -c "SELECT pg_start_backup('test');" What did you have archive_command set to? That needs to dump the WAL files generated while the backup is going on somewhere that gets copied over after the main copy is done, and you need the last of them referenced by the backup copied over before you can use that backup. Steps (1) and (5) of http://www.postgresql.org/docs/current/static/continuous-archiving.html are the hard parts here and I don't see that you're addressing them so far, and that will keep the copy from starting if all the files aren't there. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
what error show the log file? --- On Mon, 8/4/08, Greg Smith <gsmith@gregsmith.com> wrote: > From: Greg Smith <gsmith@gregsmith.com> > Subject: Re: [GENERAL] recovery via base + WAL replay failure > To: "Rob Adams" <robfadams@cox.net> > Cc: "postgres general" <pgsql-general@postgresql.org> > Date: Monday, August 4, 2008, 5:58 PM > On Sun, 3 Aug 2008, Rob Adams wrote: > > > I made a base backup while the postgres was running > using the following batch > > file: > > psql -d test_database -U user_name -c "SELECT > pg_start_backup('test');" > > What did you have archive_command set to? That needs to > dump the WAL > files generated while the backup is going on somewhere that > gets copied > over after the main copy is done, and you need the last of > them referenced > by the backup copied over before you can use that backup. > Steps (1) and > (5) of > http://www.postgresql.org/docs/current/static/continuous-archiving.html > > are the hard parts here and I don't see that you're > addressing them so > far, and that will keep the copy from starting if all the > files aren't > there. > > -- > * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com > Baltimore, MD > > -- > Sent via pgsql-general mailing list > (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general
The WAL file archiving appears to be working correctly. These are the settings I'm using for archiving the WAL files: archive_mode = on archive_command = 'copy %p C:\backup\%f /A' archive_timeout = 15s Thanks again, --Rob Greg Smith wrote: > On Sun, 3 Aug 2008, Rob Adams wrote: > >> I made a base backup while the postgres was running using the >> following batch file: >> psql -d test_database -U user_name -c "SELECT pg_start_backup('test');" > > What did you have archive_command set to? That needs to dump the WAL > files generated while the backup is going on somewhere that gets copied > over after the main copy is done, and you need the last of them > referenced by the backup copied over before you can use that backup. > Steps (1) and (5) of > http://www.postgresql.org/docs/current/static/continuous-archiving.html > are the hard parts here and I don't see that you're addressing them so > far, and that will keep the copy from starting if all the files aren't > there. > > -- > * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD >
There is nothing in the log file (in pg_log dir) with regard to this. Should I set any particular parameter in the postgresql.conf file to log information about a failed startup? I have not altered or uncommented any lines in the "ERROR REPORTING AND LOGGING" section of the conf file. Thanks again, --Rob Adams Lennin Caro wrote: > what error show the log file? > > > --- On Mon, 8/4/08, Greg Smith <gsmith@gregsmith.com> wrote: > >> From: Greg Smith <gsmith@gregsmith.com> >> Subject: Re: [GENERAL] recovery via base + WAL replay failure >> To: "Rob Adams" <robfadams@cox.net> >> Cc: "postgres general" <pgsql-general@postgresql.org> >> Date: Monday, August 4, 2008, 5:58 PM >> On Sun, 3 Aug 2008, Rob Adams wrote: >> >>> I made a base backup while the postgres was running >> using the following batch >>> file: >>> psql -d test_database -U user_name -c "SELECT >> pg_start_backup('test');" >> >> What did you have archive_command set to? That needs to >> dump the WAL >> files generated while the backup is going on somewhere that >> gets copied >> over after the main copy is done, and you need the last of >> them referenced >> by the backup copied over before you can use that backup. >> Steps (1) and >> (5) of >> http://www.postgresql.org/docs/current/static/continuous-archiving.html >> >> are the hard parts here and I don't see that you're >> addressing them so >> far, and that will keep the copy from starting if all the >> files aren't >> there. >> >> -- >> * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com >> Baltimore, MD >> >> -- >> Sent via pgsql-general mailing list >> (pgsql-general@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-general > > > > >
I found error log entries in the Windows Event Viewer: ---------------- 2008-08-01 23:57:55 GMT FATAL: could not remove old lock file "postmaster.pid": Permission denied 2008-08-01 23:57:55 GMT HINT: The file seems accidentally left over, but it could not be removed. Please remove the file by hand and try again. ---------------- However, there is no postmaster.pid file in the data directory. (I can't find one anywhere else, either.) Any ideas? Thanks, --Rob Adams Lennin Caro wrote: > what error show the log file? > > > --- On Mon, 8/4/08, Greg Smith <gsmith@gregsmith.com> wrote: > >> From: Greg Smith <gsmith@gregsmith.com> >> Subject: Re: [GENERAL] recovery via base + WAL replay failure >> To: "Rob Adams" <robfadams@cox.net> >> Cc: "postgres general" <pgsql-general@postgresql.org> >> Date: Monday, August 4, 2008, 5:58 PM >> On Sun, 3 Aug 2008, Rob Adams wrote: >> >>> I made a base backup while the postgres was running >> using the following batch >>> file: >>> psql -d test_database -U user_name -c "SELECT >> pg_start_backup('test');" >> >> What did you have archive_command set to? That needs to >> dump the WAL >> files generated while the backup is going on somewhere that >> gets copied >> over after the main copy is done, and you need the last of >> them referenced >> by the backup copied over before you can use that backup. >> Steps (1) and >> (5) of >> http://www.postgresql.org/docs/current/static/continuous-archiving.html >> >> are the hard parts here and I don't see that you're >> addressing them so >> far, and that will keep the copy from starting if all the >> files aren't >> there. >> >> -- >> * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com >> Baltimore, MD >> >> -- >> Sent via pgsql-general mailing list >> (pgsql-general@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-general > > > > >
Finally figured out what was wrong. The data folder had incorrect permissions after unzipping the base backup. For me, the solution was unchecking the "Inherit from parent the permission entries that apply to child objects" option in the Advanced Security Settings dialog for the data folder & giving the postgres user full control. Nothing appeared in the log when the database failed to startup b/c the permissions were wrong. However, an application error did get reported to Windows which I found using the Event Viewer. --Rob