There is also http://bucardo.org/wiki/Check_postgres, but I haven't been able to get it to work for monitoring replication. I am using a custom script similar to Mahlon's, but written in Perl. Looking at Mahlon's code has shown me an error in how I have been thinking about calculating the replication lag. Thanks :)
On Wed, Oct 12, 2011 at 3:28 PM, Mahlon E. Smith <mahlon@martini.nu> wrote:
On Wed, Oct 12, 2011, Brandon Phelps wrote:
> I use Nagios to monitor various things on a few servers and have
> recently set up a hot-standby server and would obviously like to
> include the state of streaming replication in my monitoring.
>
> [...]
>
> The confusion I have is how exactly can I determine just how far
> behind the replication is during loads? Currently with no traffic
> (servers not in production yet) sent_location on the master is
> "A/10018560" and pg_last_xlog_receive_location() on the standby also
> returns "A/10018560"... How far apart can these be for me to start
> worrying? I could make a bit more sense of all this if they were
> simple timestamps or something, but the hex values returned boggle my
> mind.
>
> Any advice on these issues or other tips on monitoring the replication
> would be greatly appreciated.
Brandon: I'm using this script for Mon; you should be able to adapt it to whatever language and monitoring system you please.
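
Separately from that script, and only as a rough illustration of the hex values Brandon asks about: a location such as "A/10018560" is a log id and a byte offset, both in hex, so two locations can be turned into byte counts and subtracted. The Python below is a minimal sketch, not the script mentioned above. The function names and the thresholds are made up for illustration, and the 0xFF000000 multiplier assumes a PostgreSQL release before 9.3 (on 9.3 and later the multiplier would be 0x100000000).

```python
# Hypothetical helper names; this is an illustrative sketch only.

# Assumption: PostgreSQL releases before 9.3 skip the last 16 MB segment of
# each log id, so one log id covers 0xFF000000 bytes. For 9.3 and later,
# pass bytes_per_logid=0x100000000 instead.
PRE_93_BYTES_PER_LOGID = 0xFF000000


def xlog_to_bytes(location, bytes_per_logid=PRE_93_BYTES_PER_LOGID):
    """Convert an xlog location such as 'A/10018560' to an absolute byte count."""
    logid, offset = location.strip().split("/")
    return int(logid, 16) * bytes_per_logid + int(offset, 16)


def replication_lag_bytes(master_location, standby_location,
                          bytes_per_logid=PRE_93_BYTES_PER_LOGID):
    """Bytes of WAL the standby has not yet received (0 when fully caught up)."""
    return (xlog_to_bytes(master_location, bytes_per_logid)
            - xlog_to_bytes(standby_location, bytes_per_logid))


def nagios_state(lag_bytes, warn_bytes, crit_bytes):
    """Map a lag to a Nagios plugin exit code: 0 OK, 1 WARNING, 2 CRITICAL."""
    if lag_bytes >= crit_bytes:
        return 2
    if lag_bytes >= warn_bytes:
        return 1
    return 0


if __name__ == "__main__":
    # Values quoted in the thread: master sent_location and the standby's
    # pg_last_xlog_receive_location() are identical, so the lag is zero.
    lag = replication_lag_bytes("A/10018560", "A/10018560")
    # The thresholds here are arbitrary placeholders, not recommendations.
    print(lag, nagios_state(lag, warn_bytes=16 * 1024 ** 2, crit_bytes=64 * 1024 ** 2))
```

How large a lag is worth worrying about depends on the write load, so the warning and critical thresholds would need tuning against real traffic.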