Thread: Enhancement Request
I'd like to propose two enhancements to the PostgreSQL code, but I'm not sure if this is the correct mailing list. So if it's not then please let me know where I need to post this.
These are monitoring-centric enhancement requests since I'm trying to implement accurate monitoring in a secure fashion.
* General monitoring:
We have a need for a "monitoring" role in PostgreSQL that has read-only access to any "pg_stat" view. As of 9.4, only a super-user can read all columns of "pg_stat_activity", "pg_stat_replication", and "pg_stat_archiver" (there may be other restricted views as well). These views provide critical insight on how well the cluster is operating and what is going on.
There appears to be only two ways to gain access to these views:
- grant super-user to my monitoring user
- write custom functions owned by a super-user with "SECURITY DEFINER" and grant access to my monitoring user
Option 1 is too big of a security risk to grant to a monitoring user. If the monitoring system gets compromised then the DB will be at risk too.
Option 2 requires creating, deploying, and maintaining sets of monitoring functions, which is a decent chunk of work in a large environment with dozens or hundreds of deployments, many running different versions of postgres possibly needing custom versions of the functions. When you add the bureaucracy of a large IT organization and SOX and PCI compliance requirements it ends up being a PITA implementing or changing these functions when you only have a small DBA team.
* Streaming Replication Monitoring:
Make the "pg_stat_replication" view more persistent (maybe keep the rows for 24 hours or have a registration process?).
There appears to be no way to *accurately* monitor streaming replication via SQL alone currently. This is due to three different problems:
- When a standby gets disconnected from the master then the corresponding row is immediately removed from pg_stat_replication on the master. Once the row is gone you cannot tell via simple SQL whether a standby is behind or not (or that it even existed at all) without storing prior values and extrapolating from them.
- On the standby, if the WAL streaming process gets disconnected from the master then it does not report that it is behind because pg_last_xlog_receive_location() has not been updated from the master. The standby has no idea how far ahead the master has gotten and just blindly reports the last value.
- On a "quiet" system there may not be any update activity on the master for a long time, which makes the pg_last_xact_replay_timestamp() function report an ever-increasing interval. So it is not useable for accurately measuring lag on quiet systems.
Ideally the master should be able to report standby lag time via SQL, even when there has been a disruption in connectivity with a standby.
The only accurate method I have found to measure standby lag is to create a synthetic update that runs periodically. This works, but is less than ideal and requires adding a table to every cluster (which then has to be vacuumed frequently too) and writing and maintaining a process to update it and purge it.
These two enhancements would go a long way in making it easier to monitor PostgreSQL clusters and replication because it would eliminate a lot of custom coding requirements and enable us to pull metrics directly via simple SQL.
If anyone can provide insight on how I could accomplish these in a simple manner by other means then I'm all ears!
Thanks,
Rob
On Tue, Apr 19, 2016 at 4:37 PM, Rob Brucks <rob.brucks@rackspace.com> wrote:
I'd like to propose two enhancements to the PostgreSQL code, but I'm not sure if this is the correct mailing list. So if it's not then please let me know where I need to post this.These are monitoring-centric enhancement requests since I'm trying to implement accurate monitoring in a secure fashion.* General monitoring:We have a need for a "monitoring" role in PostgreSQL that has read-only access to any "pg_stat" view. As of 9.4, only a super-user can read all columns of "pg_stat_activity", "pg_stat_replication", and "pg_stat_archiver" (there may be other restricted views as well). These views provide critical insight on how well the cluster is operating and what is going on.There appears to be only two ways to gain access to these views:
- grant super-user to my monitoring user
- write custom functions owned by a super-user with "SECURITY DEFINER" and grant access to my monitoring user
Option 1 is too big of a security risk to grant to a monitoring user. If the monitoring system gets compromised then the DB will be at risk too.Option 2 requires creating, deploying, and maintaining sets of monitoring functions, which is a decent chunk of work in a large environment with dozens or hundreds of deployments, many running different versions of postgres possibly needing custom versions of the functions. When you add the bureaucracy of a large IT organization and SOX and PCI compliance requirements it ends up being a PITA implementing or changing these functions when you only have a small DBA team.* Streaming Replication Monitoring:Make the "pg_stat_replication" view more persistent (maybe keep the rows for 24 hours or have a registration process?).There appears to be no way to *accurately* monitor streaming replication via SQL alone currently. This is due to three different problems:
- When a standby gets disconnected from the master then the corresponding row is immediately removed from pg_stat_replication on the master. Once the row is gone you cannot tell via simple SQL whether a standby is behind or not (or that it even existed at all) without storing prior values and extrapolating from them.
- On the standby, if the WAL streaming process gets disconnected from the master then it does not report that it is behind because pg_last_xlog_receive_location() has not been updated from the master. The standby has no idea how far ahead the master has gotten and just blindly reports the last value.
- On a "quiet" system there may not be any update activity on the master for a long time, which makes the pg_last_xact_replay_timestamp() function report an ever-increasing interval. So it is not useable for accurately measuring lag on quiet systems.
Ideally the master should be able to report standby lag time via SQL, even when there has been a disruption in connectivity with a standby.The only accurate method I have found to measure standby lag is to create a synthetic update that runs periodically. This works, but is less than ideal and requires adding a table to every cluster (which then has to be vacuumed frequently too) and writing and maintaining a process to update it and purge it.These two enhancements would go a long way in making it easier to monitor PostgreSQL clusters and replication because it would eliminate a lot of custom coding requirements and enable us to pull metrics directly via simple SQL.If anyone can provide insight on how I could accomplish these in a simple manner by other means then I'm all ears!Thanks,Rob
Rob,
Enhancement or feature requests should probably go to Custormer Feedbackhttps://postgresql.uservoice.com/forums/21853-general
--
Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
Enhancement or feature requests should probably go to Custormer Feedback
https://postgresql.uservoice.com/forums/21853-general
I'm on the PostgreSQL home page: what series of links would lead to "uservoice.com" so that I can submit my feature request?
If that site is going to be recommended over posting to pgsql-general it should feature prominently on the main website.
I've seen or heard nothing that suggests anywhere except pgsql-general, this list, is the correct place to submit such requests. I do think we are lacking sufficient communication in this area but would rather see that corrected and formalized before we go pointing people to third-party sites.
David J.
Melvin Davidson wrote: > On Tue, Apr 19, 2016 at 4:37 PM, Rob Brucks <rob.brucks@rackspace.com> wrote: >> I'd like to propose two enhancements to the PostgreSQL code, but I'm not sure if this is the >> correct mailing list. So if it's not then please let me know where I need to post this. I'd post feature requests to the -hackers list, but -general is usually fine too. >> * General monitoring: >> We have a need for a "monitoring" role in PostgreSQL that has read-only access to any "pg_stat" >> view. As of 9.4, only a super-user can read all columns of "pg_stat_activity", "pg_stat_replication", >> and "pg_stat_archiver" (there may be other restricted views as well). These views provide critical >> insight on how well the cluster is operating and what is going on. >> >> There appears to be only two ways to gain access to these views: >> >> 1. grant super-user to my monitoring user >> 2. write custom functions owned by a super-user with "SECURITY DEFINER" and grant access to >> my monitoring user I think #2 is good enough. To reduce the pain of deploying such a function, you can create it in "template1" and CREATE DATABASE will automatically copy it. >> * Streaming Replication Monitoring: >> The only accurate method I have found to measure standby lag is to create a synthetic update >> that runs periodically. This works, but is less than ideal and requires adding a table to every >> cluster (which then has to be vacuumed frequently too) and writing and maintaining a process to update >> it and purge it. I second that, I have wished for something like that myself. You don't actually have to create a table for this, a simple "SELECT txid_current()" on the primary will do the trick. Still it is annoying. > Enhancement or feature requests should probably go to Custormer Feedback > <https://postgresql.uservoice.com/forums/21853-general> I didn't know that site. Was it ever announced? Yours, Laurenz Albe
On 4/20/2016 12:28 AM, Albe Laurenz wrote: >> >Enhancement or feature requests should probably go to Custormer Feedback >> ><https://postgresql.uservoice.com/forums/21853-general> > I didn't know that site. > Was it ever announced? first I've heard of it, was Melvin suggesting it in another thread here the other day. who runs/owns uservoice, what are they in this for? Who setup this postgresql uservoice site? -- john r pierce, recycling bits in santa cruz
On Wed, Apr 20, 2016 at 3:47 AM, David G. Johnston <david.g.johnston@gmail.com> wrote:
Enhancement or feature requests should probably go to Custormer Feedback
https://postgresql.uservoice.com/forums/21853-generalI'm on the PostgreSQL home page: what series of links would lead to "uservoice.com" so that I can submit my feature request?If that site is going to be recommended over posting to pgsql-general it should feature prominently on the main website.I've seen or heard nothing that suggests anywhere except pgsql-general, this list, is the correct place to submit such requests. I do think we are lacking sufficient communication in this area but would rather see that corrected and formalized before we go pointing people to third-party sites.
It is interesting that this channel was opened at least in the middle 2009 or before (judging by the feature/comments timestamps). It would be nice if uservoice provided any hint or contact for the owner of the channel, because right now it appears to be a total mystery.
I'm not sure if the owner is subscribed to -general, I wouldn't be surprised if not. Otherwise, please speak up. :-)
I think that such user communication channel can be useful if maintained properly, and if the rest of the community would be aware of its existence. What I dislike right now, is that it pretends to be official channel without any disclaimer or references to the actual official communication channels (which are the mailing lists and IRC, AFAIK).
Regards,
--
Alex
Rob, * Rob Brucks (rob.brucks@rackspace.com) wrote: > I'd like to propose two enhancements to the PostgreSQL code, but I'm not sure if this is the correct mailing list. Soif it's not then please let me know where I need to post this. This is the correct place. I don't know why people are suggesting third party sites, but the correct place is -general, as you've done, which is fantastic. > These are monitoring-centric enhancement requests since I'm trying to implement accurate monitoring in a secure fashion. I've been working on exactly this problem and 9.6 will (finally) have the start of work to improve PostgreSQL in this area. Your thoughts and use-cases are exactly what we need to continue that effort and get to a point where monitoring solutions can depend on PostgreSQL to provide the features, capabilities, and information which they need, without requiring the monitoring user to be a superuser. > * General monitoring: > We have a need for a "monitoring" role in PostgreSQL that has read-only access to any "pg_stat" view. As of 9.4, onlya super-user can read all columns of "pg_stat_activity", "pg_stat_replication", and "pg_stat_archiver" (there may beother restricted views as well). These views provide critical insight on how well the cluster is operating and what isgoing on. That was proposed and was called 'pg_monitor'. Unfortunately, through a lack of support and questions about such a role possibly being "too broad", it ended up not being included for 9.6. I'd very much like to work through those issues and find a solution for post-9.6 (note that we are past the feature freeze point for 9.6, so any further changes will be for later versions). > * Streaming Replication Monitoring: > Make the "pg_stat_replication" view more persistent (maybe keep the rows for 24 hours or have a registration process?). I believe this is improved when working with replication slots in recent versions. > If anyone can provide insight on how I could accomplish these in a simple manner by other means then I'm all ears! Please continue to engage with the PostgreSQL community on these issues. I agree that these are critical features which we really need to have and will continue to work on them, but support from users, particularly with detailed real-world use-caes, goes a very long way. Thanks! Stephen