Thread: restore_command is not running on my standby

From
Joseph Shraibman
Date:
I have twice set up pg hot standbys à la the docs at
http://www.postgresql.org/docs/9.1/interactive/hot-standby.html

The third time I'm trying this I'm running into trouble.  The first two
times were with actual servers.  This time I'm trying to set up two pg
instances on my desktop for testing.

First I create the secondary:

[jks@jks-desktop ~/work/pgmon]{f15}$ time pg_basebackup -D repl-db -P -h localhost -U replicator
59303/59303 kB (100%), 1/1 tablespace
NOTICE:  pg_stop_backup complete, all required WAL segments have been archived

real    0m1.725s
user    0m0.061s
sys     0m0.265s

Then I copy in recovery.conf and the replacement postgresql.conf and try
to start up.  I get:


LOG:  database system was interrupted; last known up at 2012-03-13 21:29:32 EDT
LOG:  could not open file "pg_xlog/00000001000000000000003D" (log file 0, segment 61): No such file or directory
LOG:  invalid checkpoint record
FATAL:  could not locate required checkpoint record
HINT:  If you are not restoring from a backup, try removing the file "/home/jks/work/pgmon/repl-db/backup_label".
LOG:  startup process (PID 28220) exited with exit code 1
LOG:  aborting startup due to startup process failure

Now the file 00000001000000000000003D does exist in the archive
directory, but it appears that restore_command is not being run.
Originally it was:
'cp /home/jks/work/pgmon/wal_drop/%f %p'

Then I changed it to:

restore_command = 'echo f %f p %p >> /tmp/rc.log ; cp /home/jks/work/pgmon/wal_drop/%f %p'

/tmp/rc.log was never created, so I assume the whole thing isn't being
run for some reason.  Any clues where I should look?
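
To separate "the command is wrong" from "the command is never invoked", the restore_command can also be exercised by hand. The sketch below assumes the server runs the command through the shell after replacing %f with the WAL file name and %p with the destination path. The helper name run_restore_command and the paths in the usage line are illustrative only, not part of PostgreSQL.

```shell
#!/bin/sh
# Sketch: expand a restore_command template the way the server is assumed
# to (%f -> WAL file name, %p -> destination path) and run it via sh -c.
# run_restore_command is a hypothetical helper, not a PostgreSQL tool.
run_restore_command() {
    template=$1   # e.g. 'cp /home/jks/work/pgmon/wal_drop/%f %p'
    wal_file=$2   # e.g. 00000001000000000000003D
    dest_path=$3  # where the segment should be copied to

    # Substitute the placeholders. '|' is the sed delimiter, so this
    # assumes the arguments contain no '|' or '&' characters.
    cmd=$(printf '%s' "$template" |
        sed -e "s|%f|$wal_file|g" -e "s|%p|$dest_path|g")

    echo "running: $cmd"
    # The exit status of the last command is returned to the caller.
    sh -c "$cmd"
}
```

For example, run_restore_command 'cp /home/jks/work/pgmon/wal_drop/%f %p' 00000001000000000000003D /tmp/restored_segment should copy the segment if the template itself is sound. If that works by hand but /tmp/rc.log still never appears when the server starts, the command is not being invoked at all.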

version is:
PostgreSQL 9.1.3 on x86_64-unknown-linux-gnu, compiled by gcc (GCC)
4.6.1 20110908 (Red Hat 4.6.1-9), 64-bit

Re: restore_command is not running on my standby

From
Fujii Masao
Date:
On Wed, Mar 14, 2012 at 11:07 AM, Joseph Shraibman <jks@selectacast.net> wrote:
> I have twice set up pg hot standbys ala the docs at
> http://www.postgresql.org/docs/9.1/interactive/hot-standby.html
>
> The third time I'm trying this I'm running into trouble.  The first two
> times were with actual servers.  This time I'm trying to set up two pg
> instances on my desktop for testing.
>
> First I create the secondary:
>
> [jks@jks-desktop ~/work/pgmon]{f15}$ time pg_basebackup -D repl-db -P -h
> localhost -U replicator
> 59303/59303 kB (100%), 1/1 tablespace
> NOTICE:  pg_stop_backup complete, all required WAL segments have been
> archived
>
> real    0m1.725s
> user    0m0.061s
> sys     0m0.265s
>
> Then I copy in recovery.conf and the replacement postgresql.conf and try to
> start up.  I get:
>
>
> LOG:  database system was interrupted; last known up at 2012-03-13 21:29:32
> EDT
> LOG:  could not open file "pg_xlog/00000001000000000000003D" (log file 0,
> segment 61): No such file or directory
> LOG:  invalid checkpoint record
> FATAL:  could not locate required checkpoint record
> HINT:  If you are not restoring from a backup, try removing the file
> "/home/jks/work/pgmon/repl-db/backup_label".
> LOG:  startup process (PID 28220) exited with exit code 1
> LOG:  aborting startup due to startup process failure
>
> Now the file 00000001000000000000003D does exist in the archive directory,
> but it appears that restore_command is not being run. Originally it was:
> 'cp /home/jks/work/pgmon/wal_drop/%f %p'
>
> Then I changed it to:
>
> restore_command = 'echo f %f p %p >> /tmp/rc.log ; cp
> /home/jks/work/pgmon/wal_drop/%f %p'
>
> /tmp/rc.log was never created, so I assume the whole thing isn't being run
> for some reason.  Any clues where I should look?

Confirm that recovery.conf is located directly under the standby's data
directory. If it is, and standby_mode is enabled in it, you should see the
following log message at the start of recovery:

    LOG:  entering standby mode

Your log has no such message, so my guess is that PostgreSQL never read
recovery.conf, most likely because the file is in the wrong location, and
therefore never ran restore_command.
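
Both checks can be scripted roughly as follows. This is only a sketch: check_standby_setup is a hypothetical helper, the data directory is assumed to be the repl-db directory from the pg_basebackup command, and the server log is assumed to be written to a file whose path you pass in.

```shell
#!/bin/sh
# Sketch: verify that recovery.conf sits in the data directory, that
# standby_mode is switched on in it, and that the server log shows the
# standby-mode message. check_standby_setup is a hypothetical helper.
check_standby_setup() {
    datadir=$1   # e.g. /home/jks/work/pgmon/repl-db
    logfile=$2   # wherever the server log is written (assumption)

    if [ ! -f "$datadir/recovery.conf" ]; then
        echo "recovery.conf not found in $datadir"
        return 1
    fi

    # Matches standby_mode = 'on' or standby_mode = on; other boolean
    # spellings are not covered by this simple pattern.
    if ! grep -Eiq "^[[:space:]]*standby_mode[[:space:]]*=[[:space:]]*'?on" \
            "$datadir/recovery.conf"; then
        echo "standby_mode is not enabled in $datadir/recovery.conf"
        return 1
    fi

    if grep -q "entering standby mode" "$logfile" 2>/dev/null; then
        echo "recovery.conf was read: standby mode entered"
    else
        echo "no 'entering standby mode' message in $logfile"
        return 1
    fi
}
```

Usage would look like check_standby_setup /home/jks/work/pgmon/repl-db /path/to/server.log, where the log path is a placeholder for wherever your instance logs.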

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center