On 11/1/06,
Raghuraman K <
raghuramank@aztecsoft.com> wrote:
Hi,
I have a table like this: create table(xyz char(10), answer number(4)). There are a lot of rows in this table. I am looking at a query that will help me represent the distribution of data records based on the column answer. For example, we may take that the highest entry for answer column is 90 and the lowest is 2 and there are 1000 records. I am looking at a query that will tell how the 1000 records are distributed between the highest and lowest answer (in this case between 90 and 2). Can anyone please help?
It helps to know what kind of distribution information you are after.
Mean:
select sum(number) / count(*) from xyz;
Median:
Check out this URL
http://72.14.203.104/search?q=cache:kvZMBQuoAbkJ:people.planetpostgresql.org/greg/index.php%3F/categories/13-Math+postgresql+median+mean+functions&hl=en&gl=us&ct=clnk&cd=1&client=firefox-aRange:
select max(number) - min(number) from xyz;
Population Variance:
select power(sum(number - mean), 2) / count(*)
from xyz
inner join (
select sum(number) / count(*) as mean from xyz
) as xyz_mean
;
Sample Variance:
select power(sum(number - mean), 2) / (count(*) - 1)
from xyz
inner join (
select sum(number) / count(*) as mean from xyz
) as xyz_mean
;
Note that I did not check the syntax for typos.
Anything more than this will require you whip out a Statistics book.
==================================================================
Aaron Bono
Aranya Software Technologies, Inc.
http://www.aranya.com http://codeelixir.com==================================================================