Thread: Desperately need a magical PG monitoring tool

Desperately need a magical PG monitoring tool

From: Andreas
Hi,

is there a tool for monitoring PG servers?

Something that checks whether a master and a hot standby are still running
flawlessly.

It'd be nice if it could generously overlook some errors like wrong
passwords.

The standby should be watched to see whether it stays in sync with the master.

I have a cron job that regularly runs pg_dump and compresses the dumps with 7zip.
I'd like to check automatically whether this backup got done and whether the
archive is actually OK.
I'm not fixed on this combination. Any compressing backup solution would do.

Last week I had memory allocation errors on the db server that
eventually crashed PG after a couple of automatic restarts.

After a manual restart on Thursday, PG at first seemed to run OK, but later
it turned out that at least one table got messed up, so the nightly pg_dump
failed. I found out on Friday and ended up with the last intact backup
from Wednesday night.  :(
Until today I had no standby server, so this wasn't really pleasant.

Is there a way to get informed ASAP when such data corruption
happens?


How do you check that everything runs well?

Re: Desperately need a magical PG monitoring tool

From: Richard Huxton
On 26/03/12 19:58, Andreas wrote:
> Hi,
>
> is there a tool for monitoring PG servers?

> How do you check that everything runs well?

There are a number of tools. You might want to google around:
- Nagios
- Monit
- Munin
There are plenty of others.

Nagios is aimed at multi-server service monitoring (and alerting). So
you can keep track of 20 websites on 5 different servers etc.

Monit is more focused on monitoring/alerting/restarting on a single server.

Munin is about performance tracking and graphing. You can set it up to
alert if parameters get outside a set range.
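
To make the "alert if a value leaves a set range" idea concrete for the
standby question, here is a minimal Python sketch of a Nagios-style check
that maps a measured replication lag to a status. The function name, the
thresholds and the way the lag is obtained are assumptions for illustration,
not something shipped with the tools above. The exit codes 0/1/2/3 for
OK/WARNING/CRITICAL/UNKNOWN follow the usual Nagios plugin convention.

```python
# Usual Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify_lag(lag_seconds, warn=60.0, crit=300.0):
    """Map a replication lag in seconds to a Nagios-style status code.

    warn and crit are hypothetical defaults; pick values that suit your
    setup. None means the lag could not be measured.
    """
    if lag_seconds is None:
        return UNKNOWN
    if lag_seconds >= crit:
        return CRITICAL
    if lag_seconds >= warn:
        return WARNING
    return OK


def main(argv):
    """Print a one-line status and return the matching exit code.

    Assumption: the lag is passed as the first argument, e.g. taken from
    a query run on the standby (PG 9.1 or later) such as
      SELECT extract(epoch FROM now() - pg_last_xact_replay_timestamp());
    Note that this value also grows while the master is idle.
    """
    try:
        lag = float(argv[1])
    except (IndexError, ValueError):
        lag = None
    status = classify_lag(lag)
    print("STANDBY %s - lag=%s" % (LABELS[status], lag))
    return status
```

To use it as a plugin you would call sys.exit(main(sys.argv)) at the end of
the script, so the monitoring tool sees the status as the exit code.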


For your scenario, I'd consider restoring the backup to another database
(on another server perhaps) and checking some suitable value (e.g. a max
timestamp in a frequently updated table). You could do all this from a
simple cron job + Perl script, but you might want to consider one of the
tools mentioned above.
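
A rough sketch of that check, written in Python here rather than Perl. It
assumes the dump has already been restored into a scratch database, that
psql is on the PATH and can connect without a password prompt, and that the
script and the database agree on the time zone. The function names and the
26-hour limit are made up for illustration.

```python
import subprocess
from datetime import datetime, timedelta


def is_fresh(latest, now, max_age=timedelta(hours=26)):
    """True if the newest row is at most max_age older than now.

    latest is None when the table is empty or the query returned nothing,
    which is treated as a failed check.
    """
    return latest is not None and now - latest <= max_age


def latest_timestamp(dbname, table, column):
    """Ask the restored database for the newest timestamp via psql.

    table and column are pasted into the SQL text, so they must come from
    your own config, never from untrusted input.
    """
    sql = "SELECT to_char(max(%s), 'YYYY-MM-DD HH24:MI:SS') FROM %s" % (
        column,
        table,
    )
    # -A: unaligned output, -t: rows only, so stdout is just the value.
    out = subprocess.check_output(["psql", "-At", "-d", dbname, "-c", sql])
    text = out.decode().strip()
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S") if text else None


# Hypothetical usage from the cron job, after the restore has finished:
#   latest = latest_timestamp("restore_check", "orders", "updated_at")
#   if not is_fresh(latest, datetime.now()):
#       ...send a mail or exit non-zero so cron reports it...
```

If the restore itself fails, or the table that got corrupted cannot be read,
psql exits non-zero and check_output raises an error, so cron would report
that case as well.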

--
   Richard Huxton
   Archonet Ltd