Tom Lane wrote:
>Joe Conway <mail@joeconway.com> writes:
>
>>Tom Lane wrote:
>>
>>>I don't have a real strong feeling about whether we should change the
>>>behavior at N=1 or not. Does the SQL200x spec provide any guidance?
>>>
>>The spec does have specific guidance in section
>>10.9 <aggregate function>:
>>
>> 1) If VAR_POP is specified, then the result is (S2-S1*S1/N)/N.
>> 2) If VAR_SAMP is specified, then:
>> A) If N is 1 (one), then the result is the null value.
>> B) Otherwise, the result is (S2-S1*S1/N)/(N-1)
>>
>
>Okay, that probably trumps the Oracle precedent, especially seeing that
>it seems mathematically sounder. I'll make the changes.
>
The above is indeed the right thing to do for samples!
(Oracle must do something else as a convenience for programmers who
don't write code that checks for a sample size of at least two.)
What's really interesting to me is that StarOffice 6.0's spreadsheet
functions,
both the standard deviation of a sample (=STDEV) and variance of a sample
(=VAR) are bug-for-bug compatible with Excel 2002! That is, Excel has a
bug, and StarOffice has the same bug to be compatible with Excel's bug.
I assume the functions are buggy in OpenOffice as well, but I haven't
checked.
For example, both of these calculations produce answers of 0 (zero)
but they should produce answers of 1 (one):
=stdev(80000000,80000001,80000002)
=var(80000000,80000001,80000002)
When the numbers are smaller, like this:
=stdev(0,1,2)
=var(0,1,2)
They produce correct answers.
douglas "trying to exploit the R and PostgreSQL synergy" trainor