Thread: BUG #14008: corr, covar_pop function returns different values on the same dataset

The following bug has been logged on the website:

Bug reference:      14008
Logged by:          Stepan
Email address:      Stepya@ukr.net
PostgreSQL version: 9.5.1
Operating system:   RedHat Linux 64-bit
Description:

Aggregation functions line corr, covar_pop returns different values
depending on sorting order of the dataset. The issue is reproducible on
PostgreSQL version 9.5.0 and 9.5.1.

See script how to reproduce it.

  drop table if exists t ;
  drop table if exists t2 ;

   create table T
   (pk bigint,
    fk bigint);

    insert into T select generate_series as PK,
floor(random()*(567890-10)+10) as FK
    from generate_series(551475,551475+1000000);

-- Create table with the same data but different sorting order
create table T2
    as select *
    from T order by 1 desc;


    select count(1), stddev(PK), stddev(FK), covar_pop(pk, fk)
    from T
    UNION ALL
    select count(1), stddev(PK), stddev(FK), covar_pop(pk, fk)
    from T2;
-- Last column only returns different values!!!!


-- To be sure both tables contain the same data.
    select pk, fk
    from T
    except
    select pk, fk
    from T2;
--Returns 0 rows

    select pk, fk
    from T2
    except
    select pk, fk
    from T;
--Returns 0 rows


Stepan.
Stepya@ukr.net writes:
> Aggregation functions line corr, covar_pop returns different values
> depending on sorting order of the dataset.

AFAICS, that's just rounding error, which is inevitable when working
with floating-point data.

            regards, tom lane