On Thu, Oct 31, 2019 at 12:51:17PM -0300, Igor Calabria wrote:
>Hi everyone,
>
>I was taking a look at pg_stat_statements module and noticed that it does
>not collect any percentile metrics. I believe that It would be really handy
>to have those available and I'd love to contribute with this feature.
>
>The basic idea is to accumulate the the query execution times using an
>approximation structure like q-digest or t-digest and add those results to
>the pg_stat_statements table as fixed columns. Something like this
>
>p90_time:
>p95_time:
>p99_time:
>p70_time:
>...
>
>Another solution is to persist de digest structure in a binary column and
>use a function to extract the desired quantile ilke this SELECT
>approx_quantile(digest_times, 0.99) FROM pg_stat_statements
>

IMO having some sort of CDF approximation (be it a q-digest or a t-digest) would be useful, although it'd probably need to be optional (mostly because of memory consumption).

I don't see why we would not store the digests themselves. Storing just some selected percentiles would be pretty problematic, because we'd lose a lot of information on restart. Also, pg_stat_statements is not a table but a view on an in-memory hash table.

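To illustrate why keeping the digest itself matters, here is a minimal Python sketch. It is not a q-digest or t-digest; it uses a much simpler fixed log-scale histogram as a stand-in, and all the names (LogHistogram, add, merge, quantile) are hypothetical, not anything that exists in pg_stat_statements. The point is only that a stored structure can answer any quantile and keep accumulating after a reload, while a handful of stored percentile values can't be updated or merged.

```python
import math


class LogHistogram:
    """Hypothetical stand-in for a digest: counts per log-scale bucket."""

    def __init__(self, min_ms=0.001, max_ms=1_000_000.0, buckets_per_decade=20):
        self.min_ms = min_ms
        self.bpd = buckets_per_decade
        decades = math.log10(max_ms / min_ms)
        # One extra bucket catches everything at or above max_ms.
        self.counts = [0] * (int(math.ceil(decades * buckets_per_decade)) + 1)
        self.total = 0

    def _index(self, ms):
        ms = max(ms, self.min_ms)  # clamp tiny values into the first bucket
        idx = int(math.log10(ms / self.min_ms) * self.bpd)
        return min(idx, len(self.counts) - 1)

    def add(self, ms):
        """Record one execution time in milliseconds."""
        self.counts[self._index(ms)] += 1
        self.total += 1

    def merge(self, other):
        """Fold in another histogram built with the same parameters."""
        for i, c in enumerate(other.counts):
            self.counts[i] += c
        self.total += other.total

    def quantile(self, q):
        """Return the upper bound of the bucket holding the q-th quantile."""
        if self.total == 0:
            return None
        target = q * self.total
        seen = 0
        for i, c in enumerate(self.counts):
            seen += c
            if c > 0 and seen >= target:
                return self.min_ms * 10 ** ((i + 1) / self.bpd)
        return self.min_ms * 10 ** (len(self.counts) / self.bpd)


# Accumulate in two phases, e.g. before and after a restart, then merge.
before, after = LogHistogram(), LogHistogram()
for ms in range(1, 501):
    before.add(float(ms))
for ms in range(501, 1001):
    after.add(float(ms))
before.merge(after)
p99 = before.quantile(0.99)  # any quantile can be asked for, not just fixed ones
```

With 20 buckets per decade the answer can overshoot the true value by up to about 12%, which is the kind of accuracy/memory trade-off a real digest handles much better.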
regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services