Minor bug/inconveniance with restore from backup, using PITR base backup and archived wal files - Mailing list pgsql-general

From Tommy Gildseth
Subject Minor bug/inconveniance with restore from backup, using PITR base backup and archived wal files
Date
Msg-id 48DB4EE1.8090900@usit.uio.no
Whole thread Raw
Responses Re: Minor bug/inconveniance with restore from backup, using PITR base backup and archived wal files  (Tommy Gildseth <tommy.gildseth@usit.uio.no>)
Re: Minor bug/inconveniance with restore from backup, using PITR base backup and archived wal files  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
I've recently been testing our backup/restore procedures, and discovered
a minor inconvenience.

I emptied out the data directory(on a test-box), and restored it from a
backup. I made sure that pg_xlog and pg_xlog/archive_status was empty.
I then set up the recovery.conf file in the root of the data directory,
and pointed restore_command in the direction of the archived wal files.

After starting postgresql, the cluster started restoring from the
archive logs, and everything looks fine, and doing what it should. The
problem came at the end of the recovery, after the log line:

[2008-09-23 15:33:14.764 CEST] [pgtest01] [:] [] [18393] [] LOG: archive
recovery complete

followed immediately after by this line:

  [2008-09-24 13:04:52.168 CEST] pgtest01] [:] [] [13324] [] PANIC:
could not open file "pg_xlog/00000002000000030000009F" (log file 3,
segment 159): Permission denied

and, the cluster shutting down.
After a bit of looking around, and with some help from the fine people
in #postgresql on freenode, I think I figured out what was going on.
The last wal archive file was 00000001000000030000009F, and after
finishing recovery, postgresql created the file 00000002000000030000009F
(ie. 00000002 instead of 00000001) in pg_xlog. The wal-files were
archived read-only, and this file permission seemed to be carried over
to the new file created by postgresql in pg_xlog, causing the cluster to
fall over and die. Changing permissions on the last wal-file in the
archive directory to read/write made this problem go away. Moving the
last wal file from the archive folder to pg_xlog/ or, if I had partial
archive log files in pg_xlog, also made this problem go away,
irrespective of whether the file was chmod read only, or read/write.
Simply chmoding the wal-file in pg_xlog/, and trying to restart the
cluster did not work, and I found no other way to recover from this,
other than to start over again from the beginning.


--
Tommy Gildseth

pgsql-general by date:

Previous
From: Tomasz Ostrowski
Date:
Subject: Re: pg_dump | pg_sql: insert commands and foreign key constraints
Next
From: Tommy Gildseth
Date:
Subject: Re: Minor bug/inconveniance with restore from backup, using PITR base backup and archived wal files