> -----Original Message-----
> From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-
> owner@postgresql.org] On Behalf Of Bill MacArthur
> Sent: Friday, August 26, 2011 10:21 AM
> To: pgsql-admin@postgresql.org
> Subject: [ADMIN] how to tell if a replication server has stopped
> replicating
>
> Hello,
>
> We recently discovered, quite by accident, that our streaming
> replication server was no longer replicating. We noticed this in our
> master server log file:
> 2011-08-26 00:00:05 PDT postgres 192.168.17.4 [unknown]LOG:
> replication connection authorized: user=postgres host=192.168.17.4
> port=53542
> 2011-08-26 00:00:05 PDT postgres 192.168.17.4 [unknown]FATAL:
> requested WAL segment 00000001000001D10000006B has already been removed
>
> As it turned out, this had been going on for at least a week: each
> day's log files were crammed with these messages. Whatever caused
> the replication server to end up needing the WAL file is a mystery for
> another day. What I would like to do is set up a simple method of
> alerting us if replication stops. We could do a simple grep of log
> files on the replication side, but I am guessing that there is some SQL
> command that could be run against the postgres internals that would be
> cleaner. Is there such an animal?
>
> Thank you,
> Bill MacArthur
>
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00198.php
* http://archives.postgresql.org/pgsql-hackers/2010-11/msg00252.php
Those two posts should cover the basics. Some people do it other ways, but this seems to be the
generally accepted approach.
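For what it's worth, here is a rough sketch of one way to turn that into an alert. It is my own illustration, not something taken from those posts. On 9.0 you can run SELECT pg_current_xlog_location() on the master and SELECT pg_last_xlog_receive_location() (or pg_last_xlog_replay_location()) on the standby, then compare the two values. The helper names, the 64 MB threshold, and the sample locations below are all made up for the example, and the 0xFF000000 bytes-per-log-id figure is an assumption about the 9.0/9.1 WAL layout that you should check against your version.

```python
# Sketch only: estimate how far a standby is behind the master, in bytes of WAL.
# Inputs are the text values returned by
#   master:  SELECT pg_current_xlog_location();
#   standby: SELECT pg_last_xlog_receive_location();
# for example '1D1/6B000000' (hex log id, slash, hex byte offset).

# Assumption: on 9.0/9.1 each xlog "log id" covers 0xFF000000 bytes,
# because the last 16 MB segment of every 4 GB log is skipped.
BYTES_PER_LOGID = 0xFF000000

# Hypothetical alert threshold: four 16 MB WAL segments.
DEFAULT_MAX_LAG_BYTES = 64 * 1024 * 1024


def xlog_to_bytes(location, bytes_per_logid=BYTES_PER_LOGID):
    """Convert a 'logid/offset' string (both parts hex) to a byte position."""
    logid, offset = location.strip().split("/")
    return int(logid, 16) * bytes_per_logid + int(offset, 16)


def lag_bytes(master_location, standby_location):
    """Bytes of WAL the master has written that the standby has not received."""
    return xlog_to_bytes(master_location) - xlog_to_bytes(standby_location)


def replication_ok(master_location, standby_location,
                   max_lag_bytes=DEFAULT_MAX_LAG_BYTES):
    """Return False if the standby looks stalled or too far behind."""
    # The receive-location function returns NULL when streaming replication
    # is disabled or has not started; treat that as a failure.
    if standby_location is None:
        return False
    return lag_bytes(master_location, standby_location) <= max_lag_bytes


if __name__ == "__main__":
    # Made-up values: a standby stuck at the start of segment 6B of log 1D1
    # while the master has moved on to log 1D3.
    print(replication_ok("1D3/10000000", "1D1/6B000000"))
```

Run something like this from cron against both servers and send mail whenever replication_ok() returns False. A single sample can look bad during a burst of writes, so you may want to alert only after a few consecutive failures.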
I think 9.1 has some features in the works that should make replication far easier to monitor.
-Mark