Re: Streaming replication status - Mailing list pgsql-hackers

From Stefan Kaltenbrunner
Subject Re: Streaming replication status
Msg-id 4B51F5F1.2050906@kaltenbrunner.cc
In response to Re: Streaming replication status  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-hackers
Kevin Grittner wrote:
> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:
>> Kevin Grittner wrote:
>  
>>> Right, we don't want to give the monitoring software an OS login
>>> for the database servers, for security reasons.
>> depending on what exactly you mean by that, I do have to wonder how
>> you monitor more complex stuff (or stuff that requires elevated
>> privs) - say RAID health, multipath configuration, status of OS
>> level updates, "are certain processes running or not" as well as
>> basic parameters like CPU or IO load. as in stuff you cannot know
>> unless you have it exported through "some" port.
>  
> Many of those are monitored on the server one way or another,
> through a hardware card accessible only to the DBAs.  The card sends
> an email to the DBAs for any sort of distress, including impending
> or actual drive failure, ambient temperature out of bounds, internal
> or external power out of bounds, etc.  OS updates are managed by the
> DBAs through scripts.  Ideally we would tie these in to our opcenter
> software, which displays status through hundreds of "LED" boxes on
> big plasma displays in our support areas (and can send emails and
> jabber messages when things get to a bad state), but since the
> messages are getting to the right people in a timely manner, this is
> a low priority as far as monitoring enhancement requests go.

Well, a lot of people (including myself) consider it a necessity to
aggregate all of that in the system monitoring; only that way can you
guarantee proper dependency handling (i.e. no need to page for
"webserver not running" if the whole server is down).
There is also a case to be made for statistics tracking and long-term
monitoring.
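
To make the dependency handling point concrete, here is a minimal sketch in Python. The Check structure, the check names and the alerts_to_page function are all made up for illustration and do not reflect any particular monitoring tool's API; the only idea shown is that a failed check is not paged when something it depends on has already failed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Check:
    """One monitored item (hypothetical structure, for illustration only)."""
    name: str
    ok: bool
    depends_on: List[str] = field(default_factory=list)


def alerts_to_page(checks: List[Check]) -> List[str]:
    """Return names of failed checks whose dependencies are all healthy."""
    by_name: Dict[str, Check] = {c.name: c for c in checks}

    def has_failed_ancestor(check: Check, seen: Optional[Set[str]] = None) -> bool:
        # Walk the dependency chain; 'seen' guards against cycles.
        if seen is None:
            seen = set()
        for parent_name in check.depends_on:
            if parent_name in seen:
                continue
            seen.add(parent_name)
            parent = by_name[parent_name]
            if not parent.ok or has_failed_ancestor(parent, seen):
                return True
        return False

    # Page only for root causes, not for failures implied by a parent failure.
    return [c.name for c in checks if not c.ok and not has_failed_ancestor(c)]


if __name__ == "__main__":
    checks = [
        Check("host", ok=False),
        Check("webserver", ok=False, depends_on=["host"]),
        Check("postgres", ok=False, depends_on=["host"]),
    ]
    print(alerts_to_page(checks))
```

With the whole host down, only the host check is reported and the dependent service failures are suppressed. That only works if the monitoring system can see both the OS-level and the application-level checks.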


>  
> Only the DBAs have OS logins to database servers.  Monitoring
> software must deal with application ports (which have to be open
> anyway, so that doesn't add any security risk).  Since the hardware
> monitoring doesn't know about file systems, and the disk space on
> database servers is primarily an issue for the database, it made
> sense to us to add the ability to check the space available to the
> database through a database connection.  Hence, fsutil.

Still seems very backwards - there is much, much more that can only be
monitored from within the OS (and not from an external
iLO/RSA/IMM/DRAC/whatever) and that you cannot really do from within the
database (or any other application), so I'm still puzzled...


Stefan

