Re: Calculating Median value - Mailing list pgsql-novice

From Tom Lane
Subject Re: Calculating Median value
Date
Msg-id 23810.1525630427@sss.pgh.pa.us
Whole thread Raw
In response to Re: Calculating Median value  (Chandru Aroor <caroor@yahoo.com>)
List pgsql-novice
Chandru Aroor <caroor@yahoo.com> writes:
>  Thanks Tom.  You are actually correct. In talking with the business user, we need to get the 50th percentile, and
pickingthe higher value for even number of members,  so median as Excel calculates it is actually not appropriate. Will
aggregatepercentile_cont, or possibly percentile_disc do it? 

If you don't want to interpolate between the two middle values, you must
use percentile_disc not percentile_cont --- that's exactly what the
difference is between those two functions.

I think though that percentile_disc picks the first not the second of the
two middle values.  You can fix that by reversing the sort order, so it'd
look like

SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY x DESC) FROM ...

Here's some examples;

regression=# SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY x) FROM (VALUES (1),(3),(7),(9)) v(x);
 percentile_disc
-----------------
               3
(1 row)

regression=# SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY x DESC) FROM (VALUES (1),(3),(7),(9)) v(x);
 percentile_disc
-----------------
               7
(1 row)

regression=# SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY x) FROM (VALUES (1),(3),(7),(9)) v(x);
 percentile_cont
-----------------
               5
(1 row)


            regards, tom lane


pgsql-novice by date:

Previous
From: Chandru Aroor
Date:
Subject: Re: Calculating Median value
Next
From: "Amit S."
Date:
Subject: Re: Postgres warm standby with delay