Hello,
I wrote article about statistical function - when I tested Joe Celko's
method, I found some problems on not unique dataset:
on distinct dataset is rule so rows here is max(hi), then there is min(lo):
create table x1 (a integer);
insert into x1 select generate_series(1,10);
postgres=# select row_number() over (order by a), row_number() over
(order by a desc) from x1;row_number | row_number
------------+------------ 10 | 1 9 | 2 8 | 3 7 | 4
6 | 5 5 | 6 4 | 7 3 | 8 2 | 9
1| 10
(10 rows)
but on other set I got
truncate table x1;
insert into x1 values(2),(2),(3),(3),(4),(4),(5),(5),(6),(6),(6),(8),(9),(9),(10),(10);
postgres=# select row_number() over (order by a), row_number() over
(order by a desc) from x1;row_number | row_number
------------+------------ 16 | 1 15 | 2 14 | 3 11 | 4
13 | 5 12 | 6 9 | 7 10 | 8 7 | 9
8| 10 5 | 11 6 | 12 4 | 13 3 | 14 1 |
15 2 | 16
(16 rows)
I am not sure, is this correct? When this solution is correct, then
Joe Celko's method for median calculation is buggy.
Regards
Pavel Stehule