Thread: Desperately need a magical PG monitoring tool
Hi,

is there a tool for monitoring PG servers? Something that checks whether a master and a hot standby are still running flawlessly. It'd be nice if it could generously overlook some errors, like wrong passwords. The standby should be watched to see whether it stays in sync with the master.

I have a cron job that regularly runs pg_dump and compresses the dumps with 7zip. I'd like to check automatically whether this backup got done and whether the archive is actually OK. I'm not fixed on this combination. Any compressing backup solution would do.

Last week I had memory allocation errors on the db server that eventually crashed PG after a couple of automatic restarts. After a manual restart on Thursday, PG at first seemed to run OK, but later it turned out that at least one table got messed up, so the nightly pg_dump failed. I found out on Friday and ended up with the last intact backup from Wednesday night. :( Until today I had no standby server, so this wasn't really pleasant.

Is there a way to get informed ASAP when such data corruption happens? How do you watch that all runs well?
On 26/03/12 19:58, Andreas wrote:
> Hi,
>
> is there a tool for monitoring PG servers?
> How do you watch that all runs well ?

There are a number of tools. You might want to google around:
- nagios
- monit
- munin
There are plenty of others.

Nagios is aimed at multi-server service monitoring (and alerting). So you can keep track of 20 websites on 5 different servers etc.

Monit is more focused on monitoring/alerting/restarting on a single server.

Munin is about performance tracking and graphing. You can set it up to alert if parameters get outside a set range.

For your scenario, I'd consider restoring the backup to another database (on another server perhaps) and checking some suitable value (e.g. a max timestamp in a frequently updated table).

You could do all this from a simple cron-job + perl script but you might want to consider one of the tools mentioned above.

--
Richard Huxton
Archonet Ltd
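
As a rough illustration of the restore-and-check idea above, here is a minimal sketch in Python rather than perl. It assumes the nightly dump has already been restored into a scratch database, and it only checks that the newest row in one table is recent enough. The database name `restore_check`, the table `orders`, the column `updated_at`, and the 36-hour threshold are all made-up placeholders, not anything from this thread; substitute whatever fits your schema.

```python
import subprocess
import time

# Hypothetical names: replace with your own scratch database and table.
RESTORE_DB = "restore_check"
QUERY = "SELECT extract(epoch FROM max(updated_at)) FROM orders"
MAX_AGE_SECONDS = 36 * 3600  # assumed threshold for a nightly backup


def is_fresh(epoch_text, now, max_age_seconds):
    """Return True if psql's output holds a recent enough epoch timestamp."""
    text = epoch_text.strip()
    if not text:
        # Empty output: the table is empty or max() returned NULL.
        return False
    try:
        newest = float(text)
    except ValueError:
        # Anything that is not a number counts as a failed check.
        return False
    return now - newest <= max_age_seconds


def newest_row_epoch(dbname, query):
    """Run the query through psql and return its raw single-value output."""
    result = subprocess.run(
        # -A: unaligned output, -t: tuples only, so stdout is just the value.
        ["psql", "-At", "-d", dbname, "-c", query],
        capture_output=True,
        text=True,
        check=True,  # a failing psql raises CalledProcessError
    )
    return result.stdout


def main():
    output = newest_row_epoch(RESTORE_DB, QUERY)
    if not is_fresh(output, time.time(), MAX_AGE_SECONDS):
        # Non-zero exit status, so cron or a monitoring tool can alert on it.
        raise SystemExit("backup check FAILED: restored data is stale or missing")
    print("backup check OK")
```

To run this from cron, call `main()` at the bottom of the script. A non-zero exit status is also what a Nagios or monit check would typically key on, so the same script could probably be wired into one of those tools. Note that this only tells you the restored data looks current; it does not prove every table is intact.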