2008/11/5 Vladimir Sitnikov <sitnikov.vladimir@gmail.com>:
>
>> Quoted from SQL:2008
>> "If CUME_DIST is specified, then the relative rank of a row R is defined
>> as
>> NP/NR, where NP is defined
>> to be the number of rows preceding or peer with R in the window ordering
>> of
>> the window partition of R
>> and NR is defined to be the number of rows in the window partition of R."
>>
> I guess there is a difference between "row_number" and "number of rows
> preceding or peer with R"
>
> "number of rows preceding or peer with R" == count(*) over (order by salary)
>
> As far as I understand, the following query should calculate cume_dist
> properly (and it does so in Oracle):
>
> SELECT name,CAST(r AS FLOAT) / c, cd
> FROM (SELECT name,
> COUNT(*) OVER(ORDER BY salary) as r,
> COUNT(*) OVER() AS c,
> CUME_DIST() OVER(ORDER BY salary) AS cd
> FROM employees
> ) t;
>
I'm afraid I misinterpreted it. As you say,
"number of rows preceding == row_number()"
and
"rumber of rows preceding or peers to R != row_number() (neither rank())"
"peers to R" in the window function context means "same rows by the
ORDER BY clause", so in the first example, id=5 and id=6 are peers and
in both rows, NP should be 6, as Oracle and Sybase say.
Even though I understand the definition, your suggestion of COUNT(*)
OVER (ORDER BY salary) doesn't make sense. In the patch, it simply
returns the same value as row_number() but is it wrong, too?
Regards,
--
Hitoshi Harada