Re: Q:Aggregrating Weekly Production Data. How do you do it? - Mailing list pgsql-general

From John D. Burger
Subject Re: Q:Aggregrating Weekly Production Data. How do you do it?
Date
Msg-id 1BF89005-FC4E-4E0A-9B1E-E342B0080EC8@mitre.org
In response to Q:Aggregrating Weekly Production Data. How do you do it?  (Ow Mun Heng <Ow.Mun.Heng@wdc.com>)
List pgsql-general
Ow Mun Heng wrote:

> The results are valid (verified with actual data) but I don't
> understand the logic. All the statistics books I've read give stdev
> as sqrt(sum((x - ave(x))^2) / (n - 1)). The formula is very
> different, hence the confusion.

A formula is not an algorithm.  In particular, the naive way of
calculating variance or standard deviation has massive numerical
instability problems - anything involving sums of squares does.
There are a variety of alternative algorithms for stddev/variance, and
I presume your other algorithm is similarly trying to avoid these same
issues (but I have not looked closely at it).  You can also see
Wikipedia for one of the best known, due to Knuth/Welford:

   http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
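
For illustration only, here is a minimal Python sketch of the one-pass
update usually attributed to Welford.  The function name and structure
are my own assumptions for this example; this is not the code from the
original question, and I am not claiming it is what PostgreSQL does
internally.  It keeps a running mean and a running sum of squared
deviations, so it never subtracts two huge sums of squares from each
other.

```python
import math


def welford_stddev(values):
    """Sample standard deviation in a single pass (Welford-style update).

    Hypothetical helper for illustration; expects an iterable of numbers.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean

    for x in values:
        n += 1
        delta = x - mean          # deviation from the old mean
        mean += delta / n         # update the running mean
        m2 += delta * (x - mean)  # old deviation times new deviation

    if n < 2:
        raise ValueError("need at least two values for a sample stddev")

    # Dividing by (n - 1) matches the textbook sample formula.
    return math.sqrt(m2 / (n - 1))
```

Called as welford_stddev([2, 4, 4, 4, 5, 5, 7, 9]), this should agree
with the textbook formula, sqrt(32 / 7), or about 2.138.  The point of
the rearrangement is that it gives the same answer on paper while
behaving much better in floating point when the values are large
relative to their spread.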

- John D. Burger
   MITRE


