Re: Streaming replication status - Mailing list pgsql-hackers

From Stefan Kaltenbrunner
Subject Re: Streaming replication status
Msg-id 4B5010DE.50802@kaltenbrunner.cc
In response to Re: Streaming replication status  (Greg Smith <greg@2ndquadrant.com>)
Responses Re: Streaming replication status
List pgsql-hackers
Greg Smith wrote:
> Fujii Masao wrote:
>>> "I'm thinking something like pg_standbys_xlog_location() [on the primary] which returns
>>> one row per standby servers, showing pid of walsender, host name/
>>> port number/user OID of the standby, the location where the standby
>>> has written/flushed WAL. DBA can measure the gap from the
>>> combination of pg_current_xlog_location() and pg_standbys_xlog_location()
>>> via one query on the primary."
>>>     
>>
>> This function is useful but not essential for troubleshooting, I think.
>> So I'd like to postpone it.
>>   
> 
> Sure; in a functional system where primary and secondary are both up, 
> you can assemble the info using the new functions you just added, so 
> this other one is certainly optional.  I just took a brief look at the 
> code of the features you added, and it looks like it exposes the minimum 
> necessary to make this whole thing possible to manage.  I think it's OK 
> if you postpone this other bit, more important stuff for you to work on.

agreed
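
For reference, the gap measurement described above can be sketched as follows. This is only an illustration: it assumes WAL locations are reported as text in the "xlogid/offset" hexadecimal form that pg_current_xlog_location() returns, and the helper names and the bytes_per_xlogid parameter are hypothetical. The width of one xlogid depends on the server version, so the parameter should be set to match the server in use (0xFF000000 is the value commonly used for older releases).

```python
def xlog_location_to_bytes(location: str, bytes_per_xlogid: int = 2**32) -> int:
    """Convert a WAL location such as '0/3000000' to an absolute byte position.

    bytes_per_xlogid is an assumption: it is the number of WAL bytes covered
    by one increment of the part before the slash, and must match the server.
    """
    xlogid_hex, offset_hex = location.strip().split("/")
    return int(xlogid_hex, 16) * bytes_per_xlogid + int(offset_hex, 16)


def replication_gap_bytes(primary_location: str,
                          standby_location: str,
                          bytes_per_xlogid: int = 2**32) -> int:
    """Return how many bytes of WAL the standby is behind the primary."""
    primary = xlog_location_to_bytes(primary_location, bytes_per_xlogid)
    standby = xlog_location_to_bytes(standby_location, bytes_per_xlogid)
    return primary - standby


if __name__ == "__main__":
    # Hypothetical values, e.g. one read on the primary and one on the standby.
    print(replication_gap_bytes("0/3000000", "0/2000000"))
```

A monitoring script would fetch the two location strings from the primary and the standby and alert when the difference exceeds a chosen threshold.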

> 
> So:  the one piece of information I thought was most important to expose 
> here at an absolute minimum is there now.  Good progress.  The other 
> popular request that keeps popping up here is  providing an easy way to 
> see how backlogged the archive_command is, to make it easier to monitor 
> for out of disk errors that might prove catastrophic to replication.

I tend to disagree - in any reasonable production setup basic stuff 
like disk space usage is monitored by non-application-specific means.
While monitoring the backlog might be interesting for other reasons, citing 
disk space usage/exhaustion seems just wrong.
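
To illustrate what such generic monitoring can look like from the filesystem side, here is a rough sketch. It assumes the script can read the server's archive_status directory (under pg_xlog in the data directory), where the archiver leaves a ".ready" file for each segment still waiting for archive_command. The function names are hypothetical and the paths must be supplied by the caller.

```python
import os
import shutil


def archive_backlog(archive_status_dir: str) -> list:
    """Return the names of WAL segments still waiting to be archived.

    Assumes each pending segment has a '<segment>.ready' marker file
    in archive_status_dir.
    """
    suffix = ".ready"
    return sorted(
        name[:-len(suffix)]
        for name in os.listdir(archive_status_dir)
        if name.endswith(suffix)
    )


def free_fraction(path: str) -> float:
    """Return the fraction of free space on the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total
```

A cron job or an existing monitoring agent could call both functions and alert when the backlog grows or the free fraction drops below a site-specific limit.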


[...]
> 
> I'd find this extremely handy as a hook for monitoring scripts that want 
> to watch the server but don't have access to the filesystem directly, 
> even given those limitations.  I'd prefer to have the "tried to" 
> version, because it will populate with the name of the troublesome file 
> it's stuck on even if archiving never gets its first segment delivered.

While fancy and all, I think this goes way too far for the first cut at 
SR (or say this release); monitoring disk usage and tracking log files 
for errors are SOLVED issues in established production setups. If you 
are in an environment that does neither for each and every server, 
independent of what you have running on it, or a setup where the 
sysadmins are clueless and the poor DBA has to hack around that fact, you 
have way bigger issues anyway.


Stefan

