Thread: how to tell if a replication server has stopped replicating
Hello,

We recently discovered, quite by accident, that our streaming replication server was no longer replicating. We noticed this in our master server log file:

2011-08-26 00:00:05 PDT postgres 192.168.17.4 [unknown]LOG: replication connection authorized: user=postgres host=192.168.17.4 port=53542
2011-08-26 00:00:05 PDT postgres 192.168.17.4 [unknown]FATAL: requested WAL segment 00000001000001D10000006B has already been removed

As it turned out, this has been going on for at least a week, as every day's log files were crammed with these messages. Whatever caused the replication server to end up needing the WAL file is a mystery for another day. What I would like to do is set up a simple method of alerting us if replication stops. We could do a simple grep of log files on the replication side, but I am guessing that there is some SQL command that could be run against the postgres internals that would be cleaner. Is there such an animal?

Thank you,
Bill MacArthur
> -----Original Message-----
> From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Bill MacArthur
> Sent: Friday, August 26, 2011 10:21 AM
> To: pgsql-admin@postgresql.org
> Subject: [ADMIN] how to tell if a replication server has stopped replicating
>
> What I would like to do is set up a simple method of alerting us if
> replication stops. We could do a simple grep of log files on the
> replication side, but I am guessing that there is some SQL command
> that could be run against the postgres internals that would be
> cleaner. Is there such an animal?

* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00198.php
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00252.php

Those two posts should cover the basics. There are other ways some people use to do it, but this seems to be the generally accepted way. I think 9.1 has some stuff in the works that should make it far easier to monitor.

-Mark
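
A minimal sketch of one way to alert on a stalled standby, not necessarily what the linked posts describe. It assumes a 9.0-era setup where a monitoring job runs `SELECT pg_current_xlog_location()` on the master and `SELECT pg_last_xlog_receive_location()` on the standby, then hands both text values to the check below. The helper names, the return format, and the four-segment threshold are hypothetical choices for illustration. Treating the location as a plain 64-bit number is also an assumption; on older releases it may slightly overstate the gap when the value before the slash differs.

```python
from typing import Optional, Tuple

# Assumption: default 16 MB WAL segment size.
SEGMENT_BYTES = 16 * 1024 * 1024


def parse_xlog_location(location: str) -> int:
    """Convert text such as '1D1/6B000000' into an integer position.

    Both halves are hexadecimal; the high half is shifted by 32 bits.
    """
    high, low = location.strip().split("/")
    return (int(high, 16) << 32) + int(low, 16)


def check_replication(
    master_location: str,
    standby_location: Optional[str],
    max_lag_bytes: int = 4 * SEGMENT_BYTES,  # hypothetical threshold
) -> Tuple[str, str]:
    """Return (status, message) for a monitoring system to act on."""
    # The standby function returns NULL when nothing has been streamed,
    # which arrives here as None (or an empty string).
    if not standby_location:
        return "CRITICAL", "standby reported no receive location"

    lag = parse_xlog_location(master_location) - parse_xlog_location(
        standby_location
    )
    if lag > max_lag_bytes:
        return "CRITICAL", f"standby is {lag} bytes behind the master"
    return "OK", f"standby is {lag} bytes behind the master"
```

Run from cron, a wrapper would fetch the two values with whatever client is already in use and send mail whenever the status is not "OK". A standby that keeps failing with "requested WAL segment ... has already been removed" would stop advancing, so the gap would grow past the threshold as the master keeps writing.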