Thread: pitr errors
I am testing PITR, following the instructions in: "23.3.3. Recovering with
an On-line Backup." In step 9 I inspect the database and find that it is
working perfectly. All the data from the original is present and I am able
to create a new table and insert rows. The problem is that there are errors
in the log as follows:
########################################
...
2009-09-23 13:32:32 PDT::@[11675]:LOG: restored log file
"00000003000000EF000000A5" from archive
cp: cannot stat
`/home/postgres/data/pnp/pgsql8.3.3/mnt/data/pns/pgsql8.3.3/archive_log/00000003000000EF000000A6':
No such file or directory
2009-09-23 13:32:32 PDT::@[11675]:LOG: record with zero length at
EF/A6000560
2009-09-23 13:32:32 PDT::@[11675]:LOG: redo done at EF/A6000530
2009-09-23 13:32:32 PDT::@[11675]:LOG: last completed transaction was at
log time 2009-09-23 13:32:51.364195-07
cp: cannot stat
`/home/postgres/data/pnp/pgsql8.3.3/mnt/data/pns/pgsql8.3.3/archive_log/00000003000000EF000000A6':
No such file or directory
cp: cannot stat
`/home/postgres/data/pnp/pgsql8.3.3/mnt/data/pns/pgsql8.3.3/archive_log/00000004.history':
No such file or directory
2009-09-23 13:32:32 PDT::@[11675]:LOG: selected new timeline ID: 4
2009-09-23 13:32:32 PDT::@[11675]:LOG: restored log file "00000003.history"
from archive
2009-09-23 13:32:32 PDT::@[11675]:LOG: archive recovery complete
2009-09-23 13:32:32 PDT::@[11675]:LOG: checkpoint starting: shutdown
immediate
2009-09-23 13:32:33 PDT::@[11675]:LOG: checkpoint complete: wrote 824
buffers (2.5%); 0 transaction log file(s) added, 0 removed, 0 recycled;
write=0.010 s, sync=0.336 s, total=0.356 s
2009-09-23 13:32:33 PDT::@[11759]:LOG: autovacuum launcher started
2009-09-23 13:32:33 PDT::@[11673]:LOG: database system is ready to accept
connections
########################################
When I check the WAL file directories I find that all is in order. The file
00000004.history does not exist, but it is not supposed to exist. The file
00000003000000EF000000A6 is not in the backup archive (archive_log) because
it had not been moved there yet at the time I did the pitr. It is in the
current archive (pg_xlog), where it should be. Here is a listing of those
directories:
########################################
> ls archive_log/
...
00000003000000A100000024.0001F538.backup
00000003000000EE00000097 00000003000000EF000000A2
00000003000000A1000000B5.0001B730.backup
00000003000000EE00000098 00000003000000EF000000A3
00000003000000A200000047.0001EE70.backup
00000003000000EE00000099 00000003000000EF000000A4
00000003000000A2000000D8.0001B938.backup
00000003000000EE0000009A 00000003000000EF000000A5
00000003000000A30000006A.00020230.backup
00000003000000EE0000009B 00000003.history
> ls ../archive_log/*history
../archive_log/00000002.history ../archive_log/00000003.history
> ls ../pg_xlog/
00000002.history 00000003000000EF000000A6
00000003000000EF000000A9 00000003000000EF000000AC
00000003000000EF0000005D.000000D8.backup 00000003000000EF000000A7
00000003000000EF000000AA 00000003.history
00000003000000EF000000A5 00000003000000EF000000A8
00000003000000EF000000AB archive_status
> ls ../pg_xlog/*
../pg_xlog/00000002.history
../pg_xlog/00000003000000EF000000A7 ../pg_xlog/00000003000000EF000000AB
../pg_xlog/00000003000000EF0000005D.000000D8.backup
../pg_xlog/00000003000000EF000000A8 ../pg_xlog/00000003000000EF000000AC
../pg_xlog/00000003000000EF000000A5
../pg_xlog/00000003000000EF000000A9 ../pg_xlog/00000003.history
../pg_xlog/00000003000000EF000000A6
../pg_xlog/00000003000000EF000000AA
../pg_xlog/archive_status:
00000002.history.done 00000003000000EF0000005D.000000D8.backup.done
00000003000000EF000000A5.done 00000003.history.done
########################################
So, my question is, can I ignore these error messages and assume that the
pitr worked and all is fine?
Thanks,
Lou ... x69925
On Tue, 29 Sep 2009, Louis Fridkis wrote:

> When I check the WAL file directories I find that all is in order. The file
> 00000004.history does not exist, but it is not supposed to exist. The file
> 00000003000000EF000000A6 is not in the backup archive (archive_log) because
> it had not been moved there yet at the time I did the pitr.

The current docs address all of this. Your message suggests you might be
looking at the docs from an earlier version, but I wasn't sure which; this
might not have been as clear then.

http://www.postgresql.org/docs/8.4/interactive/continuous-archiving.html

"It is important that the [restore] command return nonzero exit status on
failure. The command will be asked for files that are not present in the
archive; it must return nonzero when so asked. This is not an error
condition. Not all of the requested files will be WAL segment files; you
should also expect requests for files with a suffix of .backup or
.history...Normally, recovery will proceed through all available WAL
segments, thereby restoring the database to the current point in time (or as
close as we can get given the available WAL segments). So a normal recovery
will end with a "file not found" message, the exact text of the error
message depending upon your choice of restore_command. You may also see an
error message at the start of recovery for a file named something like
00000001.history. This is also normal and does not indicate a problem in
simple recovery situations. See Section 24.3.4 for discussion."

Are you using pg_standby for your restore_command? If not, you probably
should be. I'm not sure whether all of the error messages you showed would
go away if you switched to it, but the specific "cp" errors suggest you're
using the simple restore_command example from the manual, which is really
not a production-quality solution.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
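
To illustrate the point quoted from the docs, here is a minimal sketch of a
restore_command helper. It is not pg_standby and not anything shipped with
PostgreSQL; the script, its function names (classify, timeline_of, restore,
main) and its messages are hypothetical, chosen only for illustration. It
copies a requested file out of the archive, and when the file is absent it
returns nonzero with a note that this is expected, instead of the bare
"cp: cannot stat" seen in the log above.

```python
#!/usr/bin/env python3
"""Hypothetical restore_command helper (illustration only, not pg_standby)."""
import os
import re
import shutil
import sys

# File name shapes that recovery may request, matching the listings above.
WAL_SEGMENT = re.compile(r"^[0-9A-F]{24}$")
HISTORY = re.compile(r"^[0-9A-F]{8}\.history$")
BACKUP = re.compile(r"^[0-9A-F]{24}\.[0-9A-F]{8}\.backup$")


def classify(name):
    """Return 'wal', 'history', 'backup', or 'unknown' for a requested name."""
    if WAL_SEGMENT.match(name):
        return "wal"
    if HISTORY.match(name):
        return "history"
    if BACKUP.match(name):
        return "backup"
    return "unknown"


def timeline_of(name):
    """The first 8 hex digits of these file names are the timeline ID."""
    return int(name[:8], 16)


def restore(archive_dir, file_name, dest_path):
    """Copy file_name from archive_dir to dest_path.

    Returns 0 on success and 1 when the file is not in the archive. A
    missing file is reported as expected, since recovery normally ends by
    asking for a segment or history file that was never archived.
    """
    src = os.path.join(archive_dir, file_name)
    if not os.path.isfile(src):
        print(
            "restore: %s (%s, timeline %d) not in archive; "
            "normal at the end of recovery"
            % (file_name, classify(file_name), timeline_of(file_name)),
            file=sys.stderr,
        )
        return 1
    shutil.copy(src, dest_path)
    return 0


def main(argv):
    """argv = [script, archive_dir, file_name, dest_path]; 2 on bad usage."""
    if len(argv) != 4:
        print("usage: restore_wal.py ARCHIVE_DIR FILE DEST", file=sys.stderr)
        return 2
    return restore(argv[1], argv[2], argv[3])


# Run as a command only when arguments were passed, so the file can also be
# imported or executed bare (for example by a test) without exiting.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

Assuming the script were saved as restore_wal.py (a made-up name and
location), recovery.conf could call it with the standard %f and %p
placeholders:

restore_command = '/path/to/restore_wal.py /path/to/archive_log %f %p'

With this in place, the requests for 00000003000000EF000000A6 and
00000004.history would still fail, as they must, but the log would label
them as expected rather than looking like a broken copy.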