Re: Add min and max execute statement time in pg_stat_statement - Mailing list pgsql-hackers

From David G Johnston
Subject Re: Add min and max execute statement time in pg_stat_statement
Date
Msg-id 1421796737983-5834805.post@n5.nabble.com
In response to Re: Add min and max execute statement time in pg_stat_statement  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: Add min and max execute statement time in pg_stat_statement
List pgsql-hackers
Andrew Dunstan wrote
> On 01/20/2015 01:26 PM, Arne Scheffer wrote:
>>
>> And a very minor aspect:
>> The term "standard deviation" in your code stands for
>> (corrected) sample standard deviation, I think,
>> because you divide by n-1 instead of n to keep the
>> estimator unbiased.
>> How about mentioning the prefix "sample"
>> to indicate this being the estimator?
> 
> 
> I don't understand. I'm following pretty exactly the calculations stated 
> at <http://www.johndcook.com/blog/standard_deviation/>
> 
> 
> I'm not a statistician. Perhaps others who are more literate in 
> statistics can comment on this paragraph.

I'm largely in the same boat as Andrew but...

I take it that Arne is referring to:

http://en.wikipedia.org/wiki/Bessel's_correction

but the mere presence of an (n-1) divisor does not mean that is what is
happening.  In this particular case I believe the (n-1) is simply a
necessary part of the recurrence formula, not an attempt to correct for
sampling bias when estimating a population's variance.  In fact, as far as
the database knows, the values passed to this function represent an entire
population, so such a correction would be unnecessary.  I guess it boils
down to whether "future" queries are considered part of the population, or
whether the population changes each time a query is run and we are therefore
calculating an ever-changing population variance.  Note point 3 in the
linked Wikipedia article.
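
To make the distinction concrete, here is a minimal sketch in Python (not
the patch's C code) of a Welford-style running calculation like the one on
the John D. Cook page.  The class name, method names and sample timings are
made up for illustration.  In this sketch the running state is only the
count, the mean and the sum of squared deviations; the choice between
dividing by n and by n-1 appears only in the final step, so either figure
can be reported from the same state.

```python
import math


class RunningStats:
    """Running mean and variance; names here are hypothetical."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self.s = 0.0      # running sum of squared deviations from the mean

    def push(self, x):
        # Update the mean first, then accumulate the squared deviation
        # using both the old and the new mean.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.s += delta * (x - self.mean)

    def population_variance(self):
        # Divide by n: treats the values seen so far as the whole population.
        return self.s / self.n if self.n > 0 else 0.0

    def sample_variance(self):
        # Divide by n-1: Bessel's correction, treats the values as a sample.
        return self.s / (self.n - 1) if self.n > 1 else 0.0


# Hypothetical execution times, chosen so the arithmetic is easy to check.
times = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

stats = RunningStats()
for t in times:
    stats.push(t)

print("mean:", stats.mean)
print("population stddev:", math.sqrt(stats.population_variance()))
print("sample stddev:", math.sqrt(stats.sample_variance()))
```

For these eight values the sum of squared deviations is 32, so the two
variances are 32/8 and 32/7.  The gap between them shrinks as n grows, which
is why the choice matters mostly for statements that have run only a few
times.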

David J.





