Hi,
We have configured synchronous master-slave replication.
Node "postgres1" is the master and node "postgres2" is the slave.
It works fine, but when we restart the master (using "pg_ctl restart"),
replication breaks. The slave log shows the following message:
2012-07-16 13:10:20 CESTFATAL: replication terminated by primary server
On the slave, the title of the startup process shows which file it is
waiting for:
[postgres@postgres2 walpostgres1]$ ps -ef | grep start
postgres 1874 1872 0 Jun19 ? 00:01:04 postgres: startup process waiting for 000000010000000C000000A6
Once we manually copy this file from the master (from
$PGDATA/pg_xlog) to the archive location on the slave, replication
resumes:
[postgres@postgres1 pg_xlog]$ scp 000000010000000C000000A6 postgres2:/home/postgres/walpostgres1
000000010000000C000000A6 100% 16MB 16.0MB/s 00:01
And in the slave log:
2012-07-16 13:16:03 CESTLOG: restored log file "000000010000000C000000A6" from archive
2012-07-16 13:16:03 CESTLOG: record with zero length at C/A60000B0
2012-07-16 13:16:03 CESTLOG: unexpected pageaddr C/84000000 in log file 12, segment 166, offset 0
2012-07-16 13:16:03 CESTLOG: streaming replication successfully connected to primary
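In case it helps with the diagnosis: the segment named in the
"unexpected pageaddr" message (log file 12, segment 166) appears to be
the same file the startup process was waiting for. The sketch below
decodes the file name, assuming the usual layout of three 8-digit
hexadecimal fields (timeline, log file, segment). The function name
parse_wal_segment_name is only illustrative and is not part of
PostgreSQL.

```python
import string


def parse_wal_segment_name(name):
    """Split a WAL segment file name into (timeline, log file, segment).

    Assumes the name is 24 hexadecimal digits: 8 for the timeline,
    8 for the log file number and 8 for the segment number.
    """
    if len(name) != 24 or any(c not in string.hexdigits for c in name):
        raise ValueError("not a 24-digit hexadecimal WAL segment name: %r" % name)
    timeline = int(name[0:8], 16)
    log_file = int(name[8:16], 16)
    segment = int(name[16:24], 16)
    return timeline, log_file, segment


# The file the slave was waiting for.
print(parse_wal_segment_name("000000010000000C000000A6"))  # (1, 12, 166)
```

So the file we copied by hand is timeline 1, log file 12, segment 166,
which matches the numbers in the slave log.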
Is this a configuration issue?
The archive_command in postgresql.conf on the master is:
archive_command = 'scp %p postgres2:/home/postgres/walpostgres1/%f'
Thank you
Best regards,
Nicolau