Re: Wait events monitoring future development - Mailing list pgsql-hackers
From | Tsunakawa, Takayuki |
---|---|
Subject | Re: Wait events monitoring future development |
Date | |
Msg-id | 0A3221C70F24FB45833433255569204D1F5C0889@G01JPEXMBYT05 |
In response to | Re: Wait events monitoring future development (Jim Nasby <Jim.Nasby@BlueTreble.com>) |
List | pgsql-hackers |
From: pgsql-hackers-owner@postgresql.org
> Lets put this in perspective: there's tons of companies that spend thousands
> of dollars per month extra by running un-tuned systems in cloud environments.
> I almost called that "waste" but in reality it should be a simple business
> question: is it worth more to the company to spend resources on reducing
> the AWS bill or rolling out new features?
> It's something that can be estimated and a rational business decision made.
>
> Where things become completely *irrational* is when a developer reads
> something like "plpgsql blocks with an EXCEPTION handler are more expensive"
> and they freak out and spend a bunch of time trying to avoid them, without
> even the faintest idea of what that overhead actually is.
> More important, they haven't the faintest idea of what that overhead costs
> the company, vs what it costs the company for them to spend an extra hour
> trying to avoid the EXCEPTION (and probably introducing code that's far
> more bug-prone in the process).
>
> So in reality, the only people likely to notice even something as large
> as a 10% hit are those that were already close to maxing out their hardware
> anyway.
>
> The downside to leaving stuff like this off by default is users won't
> remember it's there when they need it. At best, that means they spend more
> time debugging something than they need to. At worse, it means they suffer
> a production outage for longer than they need to, and that can easily exceed
> many months/years worth of the extra cost from the monitoring overhead.

I'd rather like this way of positive thinking. It will be better to think of the event monitoring as a positive feature for (daily) proactive improvement, not only as a debugging feature which gives a negative image. For example, pgAdmin4 can display the 10 most time-consuming events and their solutions. The DBA initially places the database and WAL on the same volume. As the system grows and the write workload increases, the DBA can get a suggestion from pgAdmin4 that he can prepare for the system growth by placing WAL on another volume to reduce WALWriteLock wait events. This is not debugging, but proactive monitoring.

>
> As another idea, we can stand on the middle ground. Interestingly, MySQL
> also enables their event monitoring (Performance Schema) by default, but
> not all events are collected. I guess highly encountered events are not
> collected by default to minimize the overhead.
>
> That's what we currently do with several track_* and log_*_stats GUCs,
> several of which I forgot even existed until just now. Since there's question
> over the actual overhead maybe that's a prudent approach for now, but I
> think we should be striving to enable these things ASAP.

Agreed. And as Bruce said, it may be better to be able to disable collection of some events that have a visible impact on performance.

Regards
Takayuki Tsunakawa