scripted 'pg_ctl start' hangs and never finishes, goes - Mailing list pgsql-general

From	Brian Fehrle
Subject	scripted 'pg_ctl start' hangs and never finishes, goes
Date	March 28, 2012 22:32:06
Msg-id	4F739159.4020808@consistentstate.com
Responses	Re: scripted 'pg_ctl start' hangs and never finishes, goes
List	pgsql-general

Hi all,

OS: Linux 64bit
PostgreSQL Version: 9.0.5 installed from source.

I'm writing up a process that will bring down a warm standby cluster, tarball the data directory, then bring the warm standby back up. I'm having an issue where starting the database with pg_ctl results in the command never exiting. The warm standby does come back online and starts recovering WAL files (evident in the log), but the command just does not exit. When I press Ctrl-C in the script, the database receives a "fast shutdown".

Basic script logic:
pg_ctl -D /path/to/datadir stop -m fast

cd /path/to/datadir/
tar -czvf /backups/mydatabase.tar.gz *

pg_ctl -D /path/to/datadir start
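
The logic above can be sketched as a small shell function with error checks. This is an illustration only, not the actual script: backup_standby, the PG_CTL variable and the log file argument are hypothetical names, tar -C with "." replaces the cd and "*" (so dotfiles are included as well), and pg_ctl's -l option is used so that server output goes to a file.

```shell
#!/bin/sh
# Sketch only: PG_CTL, backup_standby and the log file argument are
# illustrative names, not taken from the original script.
PG_CTL=${PG_CTL:-pg_ctl}

backup_standby() {
    datadir=$1   # cluster data directory
    archive=$2   # tarball to create (should live outside the data directory)
    logfile=$3   # file that receives server output after the restart

    # Stop the standby; give up if the stop fails so a live cluster is never tarred.
    "$PG_CTL" -D "$datadir" stop -m fast || return 1

    # Archive the contents of the data directory without changing directory.
    tar -czf "$archive" -C "$datadir" . || return 1

    # Bring the standby back up, sending server output to the log file.
    "$PG_CTL" -D "$datadir" -l "$logfile" start
}
```

It could be called as: backup_standby /path/to/datadir /backups/mydatabase.tar.gz /path/to/server.log (the log path is a placeholder).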

Originally, I was performing the 'pg_ctl start' over ssh from another box, but I ran into this issue and just assumed it had something to do with doing it over ssh. Now I'm doing it on the actual database box from a Perl script and I've started having the same issue.

I'm testing this on a very small database, about 2 MB in size. When I execute each step manually, it works just fine.

Actual Perl code:
my $output = qx(/bin/pg_ctl -D $dataDir start 2>&1);

One last thing: while the command is 'hung', searching for a running pg_ctl process returns:
[postgres@gridpoint_4 bin]$ ps aux | grep pg_ctl
postgres 601 0.0 0.0 0 0 pts/3 Z+ 15:26 0:00 [pg_ctl] <defunct>
postgres 619 0.0 0.0 61180 748 pts/2 S+ 15:26 0:00 grep pg_ctl

Below is the log from the warm standby as the actions take place.

LOG: received fast shutdown request
LOG: shutting down
LOG: database system is shut down
LOG: database system was shut down in recovery at 2012-03-28 15:02:43 PDT
LOG: starting archive recovery
LOG: restored log file "000000010000000000000057" from archive
LOG: redo starts at 0/57000240
LOG: consistent recovery state reached at 0/58000000

<Here is where, after 5 minutes of waiting, I press Ctrl-C on the process>

LOG: received fast shutdown request
LOG: shutting down
LOG: database system is shut down

Any thoughts on what the issue could be? This has happened in the same environment whether I'm doing it from within Perl on the actual cluster, or over an ssh command such as ssh user@standby "pg_ctl -D /path/to/data/ start". What the two cases have in common is that pg_ctl becomes a child process of something other than my own shell. Could that be the issue?
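
One way to probe that hypothesis without PostgreSQL involved is sketched below. It assumes (unconfirmed for this case) that the hang comes from a background process keeping the captured stdout open, so the capture never sees end-of-file even though the launcher itself has exited; that would be consistent with the defunct pg_ctl entry above. launch_inheriting, launch_detached and time_capture are made-up names, and sleep 3 stands in for the long-running server.

```shell
#!/bin/sh
# Stand-in for a launcher whose background child inherits the caller's stdout.
launch_inheriting() {
    sh -c 'sleep 3 & echo started'
}

# Same launcher, but the child's stdout and stderr are pointed elsewhere.
launch_detached() {
    sh -c 'sleep 3 >/dev/null 2>&1 & echo started'
}

# Capture a command's output (like qx in Perl) and print how many whole
# seconds the capture took to return.
time_capture() {
    start=$(date +%s)
    out=$("$@")
    end=$(date +%s)
    echo $((end - start))
}

echo "inheriting: $(time_capture launch_inheriting)s"
echo "detached:   $(time_capture launch_detached)s"
```

In this sketch the first capture only returns once the background sleep exits, while the second returns right away. If the same thing is happening with pg_ctl, giving the server its own output destination (for example with pg_ctl's -l option) might let the capture return.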

Thanks in advance,

- Brian F