Thread: Standard Deviation function.
Hello, I really need a Standard Deviation aggregate function so I will try to write one. I know about the man pages for "create aggregate" and "create function". Is there something else I should look at? Just a few pointers could save me a few hours of hunting around. All advice accepted. It seems kind of hard to do with only two state functions unless I "cheat". I need to keep three values, Count, Sum, and Sum of Squares. I could use three static variables and have the final function ignore its input and use the static vars instead. This will likely blow up if the new Standard Deviation aggregate is used twice in the same select. Any hints or advice?? If someone has this done already let me know. I may want do a "median" aggregate function too as I'll need that later. This would require private storage and a sort. Could you cc me at both addresses below as I move around between them Thanks, -- --Chris Albertson chrisja@jps.net chris@topdog.logicon.com Voice: 626-351-0089 X127 Fax: 626-351-0699
> I really need a Standard Deviation aggregate function... > > I know about the man pages for "create aggregate" and "create > function". Is there something else I should look at? > > It seems kind of hard to do with only two state functions unless > I "cheat". I need to keep three values, Count, Sum, and Sum of > Squares. > > Any hints or advice?? I thought about this a long time ago and had an idea but never got around to trying to implement it. I was going to have some functions that worked on a structure of two doubles to track the sum and square instead of using only one simple type. darrenk
> > > I really need a Standard Deviation aggregate function... > > > > I know about the man pages for "create aggregate" and "create > > function". Is there something else I should look at? > > > > It seems kind of hard to do with only two state functions unless > > I "cheat". I need to keep three values, Count, Sum, and Sum of > > Squares. > > > > Any hints or advice?? > > I thought about this a long time ago and had an idea but never > got around to trying to implement it. I was going to have some > functions that worked on a structure of two doubles to track > the sum and square instead of using only one simple type. I remember talking about this to someone, and the problem is that you needed the average WHILE scanning through the table, which required two passes, which the aggregate system is not designed to do. I may be wrong on this, though. -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 + If your life is a hard drive, | (610) 353-9879(w) + Christ can be your backup. | (610) 853-3000(h)
> > > I really need a Standard Deviation aggregate function... > > > > I thought about this a long time ago and had an idea but never > > got around to trying to implement it. I was going to have some > > functions that worked on a structure of two doubles to track > > the sum and square instead of using only one simple type. > > I remember talking about this to someone, and the problem is that you > needed the average WHILE scanning through the table, which required two > passes, which the aggregate system is not designed to do. I may be > wrong on this, though. I had asked you how to calculate this and the variance early last year. One (I think the variance) was two-pass because of the need for the average, but I thought the StdDev would work with the struct. Been a while and I still haven't configured #(*&^ FreeBSD ppp yet. darrenk
> I had asked you how to calculate this and the variance early last > year. One (I think the variance) was two-pass because of the need > for the average, but I thought the StdDev would work with the struct. Variance is just square of std. dev, no? > > Been a while and I still haven't configured #(*&^ FreeBSD ppp yet. Bummer. -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 + If your life is a hard drive, | (610) 353-9879(w) + Christ can be your backup. | (610) 853-3000(h)
> > > > > I really need a Standard Deviation aggregate function... > > > > > > I thought about this a long time ago and had an idea but never > > > got around to trying to implement it. I was going to have some > > > functions that worked on a structure of two doubles to track > > > the sum and square instead of using only one simple type. > > > > I remember talking about this to someone, and the problem is that you > > needed the average WHILE scanning through the table, which required two > > passes, which the aggregate system is not designed to do. I may be > > wrong on this, though. > > I had asked you how to calculate this and the variance early last > year. One (I think the variance) was two-pass because of the need > for the average, but I thought the StdDev would work with the struct. > > Been a while and I still haven't configured #(*&^ FreeBSD ppp yet. The Perl Module "Statistics/Descriptive" has on the fly variance calculation. sub add_data { my $self = shift; ##Myself my $oldmean; my ($min,$mindex,$max,$maxdex); ##Take care of appending to an existing data set $min = (defined ($self->{min}) ? $self->{min} : $_[0]); $max = (defined ($self->{max}) ? $self->{max} : $_[0]); $maxdex = $self->{maxdex} || 0; $mindex = $self->{mindex} || 0; ##Calculate new mean, pseudo-variance, min and max; foreach (@_) { $oldmean = $self->{mean}; $self->{sum} += $_; $self->{count}++; if ($_ >= $max) { $max = $_; $maxdex = $self->{count}-1; } if ($_ <= $min) { $min = $_; $mindex = $self->{count}-1; } $self->{mean} += ($_ - $oldmean) / $self->{count}; $self->{pseudo_variance} += ($_ - $oldmean) * ($_ - $self->{mean}); } $self->{min} = $min; $self->{mindex} = $mindex; $self->{max} = $max; $self->{maxdex} = $maxdex; $self->{sample_range} = $self->{max} - $self->{min}; if ($self->{count} > 1) { $self->{variance} = $self->{pseudo_variance} / ($self->{count} -1); $self->{standard_deviation} = sqrt( $self->{variance}); } return 1; } -dg David Gould dg@illustra.com 510.628.3783 or 510.305.9468 Informix Software (No, really) 300 Lakeside Drive Oakland, CA 94612 "Don't worry about people stealing your ideas. If your ideas are any good, you'll have to ram them down people's throats." -- Howard Aiken