We've found PgHero to be a good first line of defence. It doesn't have alerting yet, but it's great for a quick high-level health check.
Also +1 for Datadog. Extremely flexible and elegant UI + powerful alerting capabilities.
On Fri, May 26, 2017 at 10:32 AM, Sunkara, Amrutha <amrutha@nytimes.com> wrote:
We have been using Nagios to monitor the system-level stats. The database-level stats come from custom scripts that we have Nagios poll to get the database health. You could use pgBadger to generate reports against your database logs as well. pgBadger reports are your BFFs for performance-related specs; they are very close to the AWR reports that Oracle provides.
Storage/disk latencies: we have Oracle's OSWatcher running regularly on these hosts to generate iostat data as well.
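
For anyone wondering what one of those custom scripts polled by Nagios might look like, here is a rough sketch, not the actual scripts from the message above. It follows the usual Nagios plugin convention (exit code 0/1/2/3 for OK/WARNING/CRITICAL/UNKNOWN, one line of output with optional performance data after a `|`). The check itself (counting rows in pg_stat_activity), the function names, the PG_CONNECTIONS label and the 80/95 thresholds are all assumptions chosen for illustration, and it assumes psql is on the PATH with connection settings coming from the usual PGHOST/PGUSER/PGDATABASE environment variables.

```python
import subprocess

# Nagios plugin exit codes (standard plugin convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(connections, warn, crit):
    """Map a connection count to (exit_code, one-line plugin output)."""
    if connections >= crit:
        code = CRITICAL
    elif connections >= warn:
        code = WARNING
    else:
        code = OK
    # Text before '|' is the human-readable status; text after it is perfdata.
    line = (
        f"PG_CONNECTIONS {LABELS[code]} - {connections} connections"
        f" | connections={connections};{warn};{crit}"
    )
    return code, line


def count_connections():
    """Ask psql for the number of rows in pg_stat_activity (illustrative check)."""
    result = subprocess.run(
        ["psql", "-At", "-c", "SELECT count(*) FROM pg_stat_activity"],
        capture_output=True, text=True, check=True, timeout=10,
    )
    return int(result.stdout.strip())


def main(warn=80, crit=95):
    """Run the check; any failure to query is reported as UNKNOWN."""
    try:
        code, line = evaluate(count_connections(), warn, crit)
    except Exception as exc:
        code, line = UNKNOWN, f"PG_CONNECTIONS UNKNOWN - {exc}"
    print(line)
    return code
```

To use it as a plugin you would end the script with `sys.exit(main())` (after `import sys`) so Nagios sees the exit code. Keeping the threshold logic in `evaluate` separate from the query makes it easy to test without a database.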