OS: Linux 64-bit
PostgreSQL version: 9.0.5, installed from source.
I'm writing a process that brings down a warm standby cluster, tarballs the data directory, then brings the warm standby back up. The problem is that starting the database with pg_ctl results in the command never exiting. The warm standby does come back online and starts recovering WAL files (evident in the log), but the command itself does not exit. When I press Ctrl-C in the script, the database receives a "fast shutdown".
Basic script logic:
    pg_ctl -D /path/to/datadir stop -m fast
    cd /path/to/datadir/
    tar -czvf /backups/mydatabase.tar.gz *
    pg_ctl -D /path/to/datadir start
Originally, I was performing the 'pg_ctl start' over ssh from another box, but I ran into this issue and just assumed it had something to do with doing it over ssh. Now I'm doing it on the actual database box from a perl script and I've started having the same issue.
I'm testing this on a very small database, 2 MB in size. When I run each step manually, it works just fine.
Actual Perl code:
    my $output = qx(/bin/pg_ctl -D $dataDir start 2>&1);
One last thing: while the command is hung, I search for a running pg_ctl process and get this:
    [postgres@gridpoint_4 bin]$ ps aux | grep pg_ctl
    postgres   601  0.0  0.0      0     0 pts/3    Z+   15:26   0:00 [pg_ctl] <defunct>
    postgres   619  0.0  0.0  61180   748 pts/2    S+   15:26   0:00 grep pg_ctl
Below is the log from the warm standby as the actions take place.
    LOG: received fast shutdown request
    LOG: shutting down
    LOG: database system is shut down
    LOG: database system was shut down in recovery at 2012-03-28 15:02:43 PDT
    LOG: starting archive recovery
    LOG: restored log file "000000010000000000000057" from archive
    LOG: redo starts at 0/57000240
    LOG: consistent recovery state reached at 0/58000000
<Here is where, after 5 minutes of waiting, I Ctrl-C the process>
    LOG: received fast shutdown request
    LOG: shutting down
    LOG: database system is shut down
Any thoughts on what could be the issue? This happens in the same environment whether I run it from within Perl on the actual cluster or over ssh, for example ssh user@standby "pg_ctl -D /path/to/data/ start". What the two cases have in common is that pg_ctl becomes a child process of something other than my own shell. Could that be the issue?
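One way to test the child-process hypothesis without PostgreSQL is the shell sketch below. It is only an illustration under assumptions: `fake_start`, `fake_start_detached`, and `time_capture` are hypothetical helpers, and `sleep` stands in for the server process that pg_ctl leaves running. The idea it probes is that output capture (shell command substitution here, qx in Perl) reads from a pipe until every process holding the write end has closed it, so a long-lived background process that inherits the pipe could keep the capture waiting even after the launcher itself has exited. That would also be consistent with pg_ctl showing up as <defunct>.

```shell
#!/bin/sh
# Hypothetical stand-in for "pg_ctl start": launch a long-lived background
# process (sleep plays the role of the server) and return immediately.
fake_start() {
    sleep 3 &
    echo "server starting"
}

# Same launcher, but the background process gets its own stdout/stderr,
# so it does not inherit the caller's capture pipe.
fake_start_detached() {
    sleep 3 >/dev/null 2>&1 &
    echo "server starting"
}

# Print how many whole seconds it takes to capture a command's output.
# The capture itself mirrors qx(... 2>&1): stdout and stderr go to a pipe.
time_capture() {
    t0=$(date +%s)
    out=$("$@" 2>&1)   # captured text is discarded; only the wait matters
    t1=$(date +%s)
    echo $((t1 - t0))
}

held=$(time_capture fake_start)
freed=$(time_capture fake_start_detached)
echo "inherited pipe: ${held}s, redirected: ${freed}s"
```

If the first capture takes about as long as the sleep while the second returns almost immediately, that points at the inherited pipe rather than ssh or Perl as such. In that case, sending the server's output somewhere else would be the thing to try, for example with pg_ctl's -l logfile option.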