Thread: FYI: geometric means in one step without custom functions

FYI: geometric means in one step without custom functions

From
Andrew Gould
Date:
A long time ago, I emailed this list about calculating
a geometric mean in PostgreSQL.  Creating a custom
function didn't work because the process of
multiplying the values from each record resulted in
numbers that exceeded the limits for the size of a
number very quickly when dealing with large
populations.

I have learned, since, that you can achieve the same
end by replacing certain steps with log functions.
(Someone who is very good at math showed me this -- I
just tested the results and wrote the sql.)  This
method has 2 great benefits:

1. The method pushes the limits of deriving geometric
mean calculations considerably.
2. The default installation of PostgreSQL has
everything needed to perform the calculation.

The sql statement below calculates the geometric mean
of the lengths of stay (gm_los) for patients, grouped
by diagnostic related group and fiscal year.

The population (cases) and average length of stay
(avg_los) are also reported.

Note 1. Make sure you are calculating geometric mean
on a data type that has values to the right of the
decimal point.

Note 2. You cannot use a log function on a value <= 0.
Thus, I filtered for los > 0.

select drg_no, fy, count(pt_id) as cases,
avg(los) as avg_los,
exp(sum(ln(los::real)) * (1.0/count(pt_id))) as gm_los

from case_current where los > 0
group by drg_no, fy;

Have fun!

Andrew Gould

Re: FYI: geometric means in one step without custom functions

From
"Vincent Hikida"
Date:
This is a great technique.  It is especially useful in finance for
compounded interest for problems like the following

total return = ((1+janReturn)(1+febReturn)(1+marReturn))-1

I first learned it from an MBA in finance when I was looking over a
spreadsheet that she wrote.

Vincent Hikida,
Member of Technical Staff - Urbana Software, Inc.
"A Personalized Learning Experience"

www.UrbanaSoft.com

----- Original Message -----
From: "Andrew Gould" <andrewgould@yahoo.com>
To: "Postgres Mailing List" <pgsql-general@postgresql.org>
Sent: Sunday, July 06, 2003 8:38 AM
Subject: [GENERAL] FYI: geometric means in one step without custom functions


> A long time ago, I emailed this list about calculating
> a geometric mean in PostgreSQL.  Creating a custom
> function didn't work because the process of
> multiplying the values from each record resulted in
> numbers that exceeded the limits for the size of a
> number very quickly when dealing with large
> populations.
>
> I have learned, since, that you can achieve the same
> end by replacing certain steps with log functions.
> (Someone who is very good at math showed me this -- I
> just tested the results and wrote the sql.)  This
> method has 2 great benefits:
>
> 1. The method pushes the limits of deriving geometric
> mean calculations considerably.
> 2. The default installation of PostgreSQL has
> everything needed to perform the calculation.
>
> The sql statement below calculates the geometric mean
> of the lengths of stay (gm_los) for patients, grouped
> by diagnostic related group and fiscal year.
>
> The population (cases) and average length of stay
> (avg_los) are also reported.
>
> Note 1. Make sure you are calculating geometric mean
> on a data type that has values to the right of the
> decimal point.
>
> Note 2. You cannot use a log function on a value <= 0.
> Thus, I filtered for los > 0.
>
> select drg_no, fy, count(pt_id) as cases,
> avg(los) as avg_los,
> exp(sum(ln(los::real)) * (1.0/count(pt_id))) as gm_los
>
> from case_current where los > 0
> group by drg_no, fy;
>
> Have fun!
>
> Andrew Gould
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>