Re: Add min and max execute statement time in pg_stat_statement - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: Add min and max execute statement time in pg_stat_statement
Date
Msg-id CAMkU=1y3ZZSmLAhBZgPgxmTHa_sr7K6_eqEePt8ceNarcZv9Ow@mail.gmail.com
Whole thread Raw
In response to Re: Add min and max execute statement time in pg_stat_statement  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, Oct 21, 2013 at 1:36 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Oct 21, 2013 at 4:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Gavin Flower <GavinFlower@archidevsys.co.nz> writes:
>>>> If we're going to extend pg_stat_statements, even more than min and max
>>>> I'd like to see the standard deviation in execution time.
>
>> How about the 'median', often a lot more useful than the 'arithmetic
>> mean' (which most people call the 'average').
>
> AFAIK, median is impossible to calculate cheaply (in particular, with
> a fixed amount of workspace).  So this apparently innocent request
> is actually moving the goalposts a long way, because the space per
> query table entry is a big concern for pg_stat_statements.

Yeah, and I worry about min and max not being very usable - once they
get pushed out to extreme values, there's nothing to drag them back
toward normality except resetting the stats, and that's not something
we want to encourage people to do frequently. Of course, averages over
very long sampling intervals may not be too useful anyway, dunno.

I think the pg_stat_statements_reset() should be done every time you make a change which you think (or hope) will push the system into a new regime, which goes for either min/max or for average/stdev.

A histogram would be cool, but it doesn't seem very practical to implement.  If I really needed that I'd probably set log_min_duration_statement = 0 and mine the log files.  But that means I'd have to wait to accumulate enough logs once I made that change, then remember to turn it off.

What I'd like most in pg_stat_statements now is the ability to distinguish which queries have a user grinding their teeth, versus which ones have a cron job patiently doing a wait4.  I don't know the best way to figure that out, other than stratify on application_name.  Or maybe a way to selectively undo the query text normalization, so I could see which parameters were causing the problem.

Cheers,

Jeff

pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: Failure while inserting parent tuple to B-tree is not fun
Next
From: Andres Freund
Date:
Subject: Re: Failure while inserting parent tuple to B-tree is not fun