Thread: Top 3 values for each group in PGSQL

Top 3 values for each group in PGSQL

From
"Ila B."
Date:
Hello,

I’m working on a health database and I’m trying to extract the most popular prescription codes from a custom table I
structuredlike this: 

Year - Code - Count(code)

I want to extract the 3 codes with maximum count for each year. I know I should be using rank() but I don’t really
understandhow this works. 
I am using pgAdmin4 version 3.5 with PostgreSQL 10.6 on Windows 10 Pro and no permission to update.

Thank you in advance,
Ilaria

Re: Top 3 values for each group in PGSQL

From
hubert depesz lubaczewski
Date:
On Fri, Mar 01, 2019 at 11:51:24AM +0100, Ila B. wrote:
> Hello,
> 
> I’m working on a health database and I’m trying to extract the most popular prescription codes from a custom table I
structuredlike this:
 
> 
> Year - Code - Count(code)
> 
> I want to extract the 3 codes with maximum count for each year. I know I should be using rank() but I don’t really
understandhow this works.
 
> I am using pgAdmin4 version 3.5 with PostgreSQL 10.6 on Windows 10 Pro and no permission to update.

Consider reading
https://www.depesz.com/2012/10/05/getting-top-n-rows-per-group/

Best regards,

depesz



Re: Top 3 values for each group in PGSQL

From
Thomas Kellerer
Date:
Ila B. schrieb am 01.03.2019 um 11:51:
> Hello,
> 
> I’m working on a health database and I’m trying to extract the most popular prescription codes from a custom table I
structuredlike this:
 
> 
> Year - Code - Count(code)
> 
> I want to extract the 3 codes with maximum count for each year. I know I should be using rank() but I don’t really
understandhow this works.
 
> I am using pgAdmin4 version 3.5 with PostgreSQL 10.6 on Windows 10 Pro and no permission to update.

Something along the lines:

select code, year, "count"
from (
   select code, year, "count", 
          dense_rank() over (partition by code, year order by "count" desc) as rnk
   from the_table
) t
where rnk <= 3;



Re: Top 3 values for each group in PGSQL

From
Achilleas Mantzios
Date:
On 1/3/19 2:52 μ.μ., Thomas Kellerer wrote:
> Ila B. schrieb am 01.03.2019 um 11:51:
>> Hello,
>>
>> I’m working on a health database and I’m trying to extract the most popular prescription codes from a custom table I
structuredlike this:
 
>>
>> Year - Code - Count(code)
>>
>> I want to extract the 3 codes with maximum count for each year. I know I should be using rank() but I don’t really
understandhow this works.
 
>> I am using pgAdmin4 version 3.5 with PostgreSQL 10.6 on Windows 10 Pro and no permission to update.
> Something along the lines:
>
> select code, year, "count"
> from (
>     select code, year, "count",
>            dense_rank() over (partition by code, year order by "count" desc) as rnk
>     from the_table
> ) t
> where rnk <= 3;
Yup, that's the idea
select qryout.* FROM (select qry.*,dense_rank() OVER (ORDER BY count DESC) FROM (select id_1,year_built,count(*) from
vesselsWHERE year_built IS NOT NULL AND year_built<>'' group by id_1,year_built 
 
ORDER BY COUNT(*) DESC) as qry ) qryout WHERE dense_rank<=3;
  id_1 | year_built | count | dense_rank
------+------------+-------+------------
    94 | 2009       |    11 |          1
    97 | 2010       |    10 |          2
    94 | 2011       |    10 |          2
    94 | 1975       |     9 |          3
    94 | 1976       |     9 |          3
(5 rows)

or

select qryout.* FROM (select qry.*,dense_rank() OVER (ORDER BY count DESC) FROM (select distinct
id_1,year_built,count(*)OVER (partition by id_1,year_built) from vessels WHERE year_built IS NOT NULL 
 
AND year_built<>'') as qry ) qryout WHERE dense_rank<=3;
  id_1 | year_built | count | dense_rank
------+------------+-------+------------
    94 | 2009       |    11 |          1
    97 | 2010       |    10 |          2
    94 | 2011       |    10 |          2
    94 | 1975       |     9 |          3
    94 | 1976       |     9 |          3
(5 rows)


>


-- 
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt