Re: Add min and max execute statement time in pg_stat_statement - Mailing list pgsql-hackers

From Gavin Flower
Subject Re: Add min and max execute statement time in pg_stat_statement
Date
Msg-id 52685B24.40409@archidevsys.co.nz
Whole thread Raw
In response to Re: Add min and max execute statement time in pg_stat_statement  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-hackers
On 24/10/13 12:14, Jeff Janes wrote:
On Wed, Oct 23, 2013 at 4:00 PM, Gavin Flower <GavinFlower@archidevsys.co.nz> wrote:
On 24/10/13 11:26, Peter Geoghegan wrote:
On Wed, Oct 23, 2013 at 2:57 PM, Gavin Flower
<GavinFlower@archidevsys.co.nz> wrote:
Looks definitely bimodal in the log version, very clear!

Yes, I feel that having a 32 log binary binned histogram (as Alvaro Herrera
suggested) would be very useful.
I'm having a hard time imagining how you'd actually implement that.
For example, this:

https://wiki.postgresql.org/wiki/Aggregate_Histogram

requires that a "limit" be specified ahead of time. Is there a
principled way to increase or decrease this kind of limit over time,
and have the new buckets contents "spill into each other"?

To smplify things, I'm using 5 buckets, but 32 would be better.
Assume first bucket width is 1ms.

bucket     range
  0              x =< 1ms
  1        1ms < x =< 2ms
  2        2ms < x =< 4ms
  3        4ms < x =< 8ms
  5        8ms < x

The last bucket would be limited to 8ms < x <= 16 ms.  If you find something > 16ms, then you have to rescale *before* you increment any of the buckets.  Once you do, there is now room to hold it.

bucket     range
  0              x =< 2ms   (sum of previous bucket 0 and previous bucket 1)
  1        2ms < x =< 4ms
  2        4ms < x =< 8ms
  3        8ms < x =< 16ms
  4        16ms < x =< 32ms  (starts empty)

Cheers,

Jeff
It is very important that the last bucket be unbounded, or you may lose potentially important data.  Especially if one asumes that all durations will fit into the first n - 1 buckets, in which case being alerted to the asumption being siginificantly wrong is crucial!

The logic to check on the values for the last bucket is trivial, so there is no need to have an upper limit for it.


Cheers,
Gavin



pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: CLUSTER FREEZE
Next
From: Stephen Frost
Date:
Subject: Re: Add min and max execute statement time in pg_stat_statement