Re: stddev returns 0 when there is one row - Mailing list pgsql-general

From Murthy Kambhampaty
Subject Re: stddev returns 0 when there is one row
Date
Msg-id 2D92FEBFD3BE1346A6C397223A8DD3FC0921FA@THOR.goeci.com
Whole thread Raw
In response to stddev returns 0 when there is one row  (Bruno Wolff III <bruno@wolff.to>)
List pgsql-general
FWIW, it seems that, at least in the long run, the variance(expression) and
stddev(expression) functions, should take two optional arguments, one that
specifies whether to return the sample or population measure, and the other
that identifies the count() column, and the processing would go like this
(pardon my Bash-ism):

variance(X,ARG1,ARG2) () {
    if [[ ARG2 != NULL ]]; then
        SUMX2 = sum( X**2 *"ARG2");
        SUMX    = sum( X *"ARG2");
        N = sum( "ARG2");
    else
        SUMX2 = sum( X**2 );
        SUMX    = sum( X );
        N = count( X);
    if

    if [[ ARG1 = 0 ]]; then # let 0 specify the sample measure
        varX = (SUMX2 - SUMX *SUMX/N)/(N -1) #there was a typo in my
earlier msg: missing "/sum( freq)" in the numerator
    elif [[ ARG1 = 1 ]]; then
        varX = (SUMX2 - SUMX *SUMX/N)/(N)
    fi
}

It's easy enough to do client side, for now.
    Murthy

PS: When you said, in response to Joe Conway's post re the SQL200x spec,
that you'd make the changes, I read it to mean you were going to implement
both the sample and population measures, but on second reading I think you
were just been saying that you'd make the existing function properly return
the sample standard deviation - NULL if only 1 row given?

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Monday, April 21, 2003 12:21
To: Murthy Kambhampaty
Cc: Joe Conway; Manfred Koizar; Bruno Wolff III;
pgsql-general@postgresql.org
Subject: Re: [GENERAL] stddev returns 0 when there is one row


Murthy Kambhampaty <murthy.kambhampaty@goeci.com> writes:
> Tom, would it be too much trouble to program these in (as STDEV_POP_G and
> STDEV_SMPL_G or some variant?

I'm not volunteering, no.  (For starters, I think you'd need to upgrade
the aggregate code to allow multiple-input aggregates.  Which is worth
doing, but I do not expect to have time to think about such things for
7.4.)

            regards, tom lane


pgsql-general by date:

Previous
From: Joe Conway
Date:
Subject: Re: Using TEXT columns for binary content
Next
From: Oliver Elphick
Date:
Subject: Re: Documentation