Re: [EXTERNAL] Re: pg_stat_statements

From:           Julien Rouhaud
Subject:        Re: [EXTERNAL] Re: pg_stat_statements
Msg-id:         20220506020831.yzk55vmhxehra6ll@jrouhaud
In response to: RE: [EXTERNAL] Re: pg_stat_statements ("Godfrin, Philippe E" <Philippe.Godfrin@nov.com>)
List:           pgsql-hackers
On Thu, May 05, 2022 at 12:21:41PM +0000, Godfrin, Philippe E wrote:
>
> Thanks very much for looking closely at this. To answer your questions:
> I misspoke; the query file at the time of the queries above was around 1GB.

Ah, that's clearly big enough to lead to some slowdown.

> I don't believe I am short on RAM, although I will re-examine that aspect. I'm running 32GB
> with a 22GB shared pool, which seems OK to me. The disks are SSD (AWS EBS) and
> the disk volumes are the same as the data volumes. If a regular file on disk at 1GB
> took 6 seconds to read, the rest of the system would be in serious degradation.

I don't know what you mean by "shared pool", and you also didn't give any
information about your postgres usage, workload, number of connections or
anything else, so it's impossible to know. Note that if your system is quite
busy you could definitely have some IO saturation, especially if that file has
been discarded from the OS cache, so I wouldn't blindly rule that possibility
out. I suggested multiple ways to figure out whether that's the problem, so
having those answers would be better than guessing whether IO or "AWS EBS"
(which I also don't know anything about) is a problem or not.

> The impact on running queries was observed when the max was set at 1000. I don't
> quite understand what you keep saying about evictions and other things relative to the
> pgss file. Can you refer me to some detailed documentation or a good article which
> describes the processes you're alluding to?

I don't think there's any thorough documentation or article explaining how
pg_stat_statements works internally. But there is a maximum number of distinct
entries (identified by userid, dbid and queryid) that can be stored, so if your
workload produces more entries than that maximum, pg_stat_statements has to
evict the least used ones to make room for new ones, and that process is costly
and done while holding an exclusive lwlock.
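(For reference: on PostgreSQL 14 and later, the pg_stat_statements_info view
exposes a cumulative dealloc counter that is incremented every time entries are
evicted, so sampling it is a direct way to measure eviction frequency. A sketch,
assuming PG >= 14:)

```sql
-- PG 14+ only: dealloc counts how many times pg_stat_statements had to
-- evict least-used entries since stats_reset.
SELECT dealloc, stats_reset FROM pg_stat_statements_info;

-- Sample this twice, a few minutes apart; a steadily growing dealloc
-- value means the workload overflows pg_stat_statements.max.
```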
You didn't say which version of postgres you're using, but one thing you can do
to see whether you probably have evictions is to check the number of rows in
the pg_stat_statements view. If that number changes very often and is always
close to pg_stat_statements.max, then you probably have frequent evictions.

> Insofar as querying the stats table every 10 seconds, I think that's not aggressive enough as
> I want to have statement monitoring as close to realtime as possible.

What problem are you trying to solve? Why aren't you using pg_stat_activity if
you want a realtime overview of what is happening?

> You are indeed correct insofar as unknowns - the biggest one for me is I don't know
> enough about how the stats extension works - as I asked before, more detail on the
> internals of the extension would be useful. Is my only choice in that regard to browse
> the source code?

I think so.

> Short of running the profile that should deal with the unknowns, any other ideas?

Do you mean using perf, per the "Profiling with Perf" wiki article? Other than
that, I suggested other ways to try to narrow down the problem; what was the
outcome of those?
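(For reference, the row-count check and the pg_stat_activity alternative
described above can be done with queries along these lines; a sketch, not
taken from the thread:)

```sql
-- Compare the live entry count against the configured maximum; a count
-- that stays close to the max suggests frequent evictions.
SELECT count(*) AS entries,
       current_setting('pg_stat_statements.max') AS max_entries
FROM pg_stat_statements;

-- For near-realtime visibility, pg_stat_activity shows what is running
-- right now, without touching pg_stat_statements at all.
SELECT pid, state, wait_event_type, wait_event, query_start, query
FROM pg_stat_activity
WHERE state <> 'idle';
```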