Re: Monitoring Replication - Mailing list pgsql-general

From Mark Keisler
Subject Re: Monitoring Replication
Date
Msg-id CAGNWxWpHVeh=nF-RkZp8gk-aagzYGgVkHk5KqoDGmzgDA3OznQ@mail.gmail.com
In response to Re: Monitoring Replication  ("Mahlon E. Smith" <mahlon@martini.nu>)
List pgsql-general
There is also http://bucardo.org/wiki/Check_postgres, but I haven't been able to get it to work for monitoring replication.  I am using a custom script similar to Mahlon's, but written in Perl.  Looking at Mahlon's code has shown me an error in how I had been calculating the replication lag.  Thanks :)



On Wed, Oct 12, 2011 at 3:28 PM, Mahlon E. Smith <mahlon@martini.nu> wrote:
On Wed, Oct 12, 2011, Brandon Phelps wrote:

> I use Nagios to monitor various things on a few servers and have
> recently set up a hot-standby server and would obviously like to
> include the state of streaming replication in my monitoring.
>
> [...]
>
> The confusion I have is how exactly can I determine just how far
> behind the replication is during loads?  Currently with no traffic
> (servers not in production yet) sent_location on the master is
> "A/10018560" and pg_last_xlog_receive_location() on the standby also
> returns "A/10018560"... How far apart can these be for me to start
> worrying?  I could make a bit more sense of all this if they were
> simple timestamps or something, but the hex values returned boggle my
> mind.
>
> Any advice on these issues or other tips on monitoring the replication
> would be greatly appreciated.


Brandon:  I'm using this script for Mon; you should be able to adapt it
to whatever language and monitoring system you please.

http://www.martini.nu/misc/db_replication.monitor.txt
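
To make the hex values in Brandon's question concrete, here is a minimal
sketch (not Mahlon's script) of turning two xlog locations into a lag in
bytes.  A location such as "A/10018560" is two hexadecimal numbers: the
part before the slash is the high word, the part after it is the offset.
The function names are hypothetical, and the sketch assumes the location
can be read as one 64-bit number, which matches PostgreSQL 9.3 and later;
on older releases the multiplier for the high word is slightly smaller,
so treat the result there as an approximation.

```python
def xlog_location_to_bytes(location: str) -> int:
    """Convert an xlog location like 'A/10018560' to a byte position.

    Assumption: the location is a plain 64-bit value (high word / low word).
    """
    high, low = location.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def replication_lag_bytes(master_location: str, standby_location: str) -> int:
    """Return how many bytes of WAL the standby is behind the master."""
    return (xlog_location_to_bytes(master_location)
            - xlog_location_to_bytes(standby_location))


if __name__ == "__main__":
    # Values from the question: sent_location on the master and
    # pg_last_xlog_receive_location() on the standby are identical.
    print(replication_lag_bytes("A/10018560", "A/10018560"))
```

A monitoring check would compare the returned byte count against warning
and critical thresholds chosen for the site's write volume.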

--
Mahlon E. Smith
http://www.martini.nu/contact.html
