Thread: group by range of values
Hello, Here's my table: db=# select * from tester order by birth_year; birth_year | salary ------------+-------- 1946 | 78000 1949 | 61000 1951 | 58000 1953 | 56000 1958 | 52000 1962 | 50000 1965 | 45000 1967 | 60000 1968 | 57000 1970 | 47000 1972 | 32000 1973 | 42000 (12 rows) How can I display the average salary grouped by decade of birth year? That is, is it possible to display the average salary of those born in the 1940's, the average salary of those born in the 1950's, average salary of those born in the 1960's, and those born in the 1970's, all in one result table? Something like: decade | average(salary) -------+----------------- 1940 | 69500 1950 | 53333.33 1960 | 53000 1970 | 40333.33 Thanks in advance, C
Carol Cheung wrote: > Hello, > > Here's my table: > > db=# select * from tester order by birth_year; > birth_year | salary > ------------+-------- > 1946 | 78000 > 1949 | 61000 > 1951 | 58000 > 1953 | 56000 > 1958 | 52000 > 1962 | 50000 > 1965 | 45000 > 1967 | 60000 > 1968 | 57000 > 1970 | 47000 > 1972 | 32000 > 1973 | 42000 > (12 rows) > > How can I display the average salary grouped by decade of birth year? > That is, is it possible to display the average salary of those born in > the 1940's, the average salary of those born in the 1950's, average > salary of those born in the 1960's, and those born in the 1970's, all in > one result table? > Something like: > > decade | average(salary) > -------+----------------- > 1940 | 69500 > 1950 | 53333.33 > 1960 | 53000 > 1970 | 40333.33 > Assuming birth_year is an integer (if it's not, then just change the query to cast it to one before the division), one possible approach might be: select birth_year / 10 || '0' as decade, avg(salary::numeric) as average_salary from tester group by decade order by decade asc; -Jon -- Senior Systems Developer Media Matters for America http://mediamatters.org/
Hello you can use CASE like SELECT CASE WHEN birth_year BETWEEN 1940 AND 1949 THEN 1940 WHEN birth_year BETWEEN 1950 AND 1959 THEN 1950 WHEN birth_yearBETWEEN 1960 AND 1969 THEN 1960 WHEN birth_year BETWEEN 1970 AND 1979 THEN 1979 END, AVG(salary) FROM tester GROUPBY 1 ORDER BY 1; Regards Pavel Stehule 2007/7/27, Carol Cheung <cacheung@consumercontact.com>: > Hello, > > Here's my table: > > db=# select * from tester order by birth_year; > birth_year | salary > ------------+-------- > 1946 | 78000 > 1949 | 61000 > 1951 | 58000 > 1953 | 56000 > 1958 | 52000 > 1962 | 50000 > 1965 | 45000 > 1967 | 60000 > 1968 | 57000 > 1970 | 47000 > 1972 | 32000 > 1973 | 42000 > (12 rows) > > How can I display the average salary grouped by decade of birth year? > That is, is it possible to display the average salary of those born in > the 1940's, the average salary of those born in the 1950's, average > salary of those born in the 1960's, and those born in the 1970's, all in > one result table? > Something like: > > decade | average(salary) > -------+----------------- > 1940 | 69500 > 1950 | 53333.33 > 1960 | 53000 > 1970 | 40333.33 > > Thanks in advance, > C > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly >
On 7/27/07, Carol Cheung <cacheung@consumercontact.com> wrote: > Something like: > > decade | average(salary) > -------+----------------- > 1940 | 69500 > 1950 | 53333.33 > 1960 | 53000 > 1970 | 40333.33 CREATE TABLE tester ( birth_year integer, salary numeric(10,2) ); SELECT SUBSTRING(TO_CHAR(BIRTH_YEAR,'FM9999') FROM 1 FOR 3)||'0' AS DECADE , AVG(SALARY) AS AVG_SALARY FROM TESTER GROUP BY SUBSTRING(TO_CHAR(BIRTH_YEAR,'FM9999') FROM 1 FOR 3)||'0' ORDER BY DECADE;
On 7/27/07, I wrote: > On 7/27/07, Carol Cheung <cacheung@consumercontact.com> wrote: > > Something like: > > > > decade | average(salary) > > -------+----------------- > > 1940 | 69500 > > 1950 | 53333.33 > > 1960 | 53000 > > 1970 | 40333.33 > > CREATE TABLE tester ( > birth_year integer, > salary numeric(10,2) > ); > > SELECT > SUBSTRING(TO_CHAR(BIRTH_YEAR,'FM9999') FROM 1 FOR 3)||'0' > AS DECADE > , AVG(SALARY) AS AVG_SALARY > FROM TESTER > GROUP BY > SUBSTRING(TO_CHAR(BIRTH_YEAR,'FM9999') FROM 1 FOR 3)||'0' > ORDER BY DECADE; A bit simpler, if year is int: SELECT BIRTH_YEAR/10*10 AS DECADE , AVG(SALARY) AS AVG_SALARY FROM TESTER GROUP BY BIRTH_YEAR/10*10 ORDER BY DECADE;
2007/7/27, Carol Cheung <cacheung@consumercontact.com>: > db=# select * from tester order by birth_year; > birth_year | salary > ------------+-------- > 1946 | 78000 > 1949 | 61000 What is the data type of the birth_year column? I'd suggest using date if you can, as what it is is a date with year precision. You can't specify such a precision, but you can decide that all birth_year's will have month and year of January 1 (enforced by a CHECK constraint, if you wish), or you could just choose to ignore the month and year part in your calculations. > How can I display the average salary grouped by decade of birth year? > That is, is it possible to display the average salary of those born in > the 1940's, the average salary of those born in the 1950's, average > salary of those born in the 1960's, and those born in the 1970's, > all in > one result table? > Something like: > > decade | average(salary) > -------+----------------- > 1940 | 69500 > 1950 | 53333.33 > 1960 | 53000 > 1970 | 40333.33 Here's an example: CREATE TABLE salaries (birth_year DATE PRIMARY KEY, salary NUMERIC NOT NULL); INSERT INTO salaries (birth_year, salary) VALUES ('1946-01-01',78000), ('1949-01-01',61000), ('1951-01-01',58000) ,('1953-01-01',56000), ('1958-01-01',52000), ('1962-01-01',50000) , ('1965-01-01',45000), ('1967-01-01',60000), ('1968-01-01',57000) , ('1970-01-01',47000), ('1972-01-01',32000), ('1973-01-01',42000); SELECT birth_decade, AVG(salary) FROM ( SELECT birth_year , date_trunc('decade', birth_year)::date as birth_decade , salary FROM salaries)as salaries_with_decades GROUP BY birth_decade ORDER BY birth_decade; birth_decade | avg --------------+-------------------- 1940-01-01 | 69500.000000000000 1950-01-01 | 55333.333333333333 1960-01-01 | 53000.000000000000 1970-01-01 | 40333.333333333333 (4 rows) If birth_year is an integer column, here's another way to do it, taking advantage of the fact that integer division truncates. CREATE TABLE salaries (birth_year INTEGER PRIMARY KEY, salary NUMERIC NOT NULL); INSERT INTO salaries (birth_year, salary) VALUES (1946,78000), (1949,61000), (1951,58000), (1953,56000), (1958,52000) , (1962,50000), (1965,45000), (1967,60000), (1968,57000), (1970,47000) , (1972,32000), (1973,42000); SELECT birth_decade, AVG(salary) FROM ( SELECT birth_year , birth_year / 10 * 10 as birth_decade , salary FROM salaries) as salaries_with_decades GROUP BY birth_decade ORDER BY birth_decade; birth_decade | avg --------------+-------------------- 1940 | 69500.000000000000 1950 | 55333.333333333333 1960 | 53000.000000000000 1970 | 40333.333333333333 (4 rows) Hope this gives you some options. Michael Glaesemann grzm seespotcode net