Brian Wipf <brian@clickspace.com> writes:
> Last night, I brought the database out of its perpetual recovery
> mode. Here are the lines from the log when this was done:
> [2007-10-01 23:43:03 MDT] LOG: restored log file
> "000000010000046600000060" from archive
> [2007-10-01 23:45:50 MDT] LOG: could not open file "pg_xlog/
> 000000010000046600000061" (log file 1126, segment 97): No such file
> or directory
> [2007-10-01 23:45:50 MDT] LOG: redo done at 466/60000070
> Which is all fine, since 000000010000046600000060.gz was the last
> archived WAL file. The next entry in the log follows:
> [2007-10-01 23:45:50 MDT] PANIC: could not open file "pg_xlog/
> 000000010000046600000060" (log file 1126, segment 96): No such file
> or directory
> [2007-10-01 23:45:51 MDT] LOG: startup process (PID 27624) was
> terminated by signal 6
> [2007-10-01 23:45:51 MDT] LOG: aborting startup due to startup
> process failure
> [2007-10-01 23:45:51 MDT] LOG: logger shutting down
> And the database would not start up. The issue appears to be that the
> restore_command script itself ungzips the WAL to its destination %p,
> and the WAL is left in the archive directory as
> 000000010000046600000060.gz. By simply ungzipping the last few WALs
> manually in the archive directory, the database replayed them and
> started up successfully.
What this sounds like to me is a problem in your recovery procedures.
What exactly did you do to "bring the database out of recovery mode"?
It is expected that it would ask the restore_command script to fetch the
last WAL segment a second time, and I don't understand why that didn't
Just Work.
regards, tom lane