Re: Weighted Stats - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: Weighted Stats
Msg-id CAMkU=1y45wFL72-HkFo9SDR=Gont9qxR3=H+JzuH7=ok=PQb7w@mail.gmail.com
In response to Re: Weighted Stats  (David Fetter <david@fetter.org>)
Responses Re: Weighted Stats  (David Fetter <david@fetter.org>)
Re: Weighted Stats  (David Fetter <david@fetter.org>)
List pgsql-hackers
On Tue, Mar 15, 2016 at 8:36 AM, David Fetter <david@fetter.org> wrote:
>
> Please find attached a patch that uses the float8 version to cover the
> numeric types.

Is there a well-defined meaning for a negative weight?  If not,
should negative weights be disallowed?

I don't know what I was expecting, but not this:

select weighted_avg(x,10000000-2*x) from generate_series(1,10000000) f(x);
   weighted_avg
------------------
 16666671666717.1
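
For reference, here is a minimal sketch of the exact arithmetic behind that query, assuming weighted_avg is meant to compute sum(w*x)/sum(w). The helper names are hypothetical and nothing here comes from the patch. With w = n - 2*x the weights sum to -n, so the quotient reduces to (n+1)(n+2)/6, far outside the range of x:

```python
from fractions import Fraction


def weighted_avg_exact(n):
    """Exact sum(w*x)/sum(w) for x = 1..n with weight w = n - 2*x."""
    sum_w = sum(n - 2 * x for x in range(1, n + 1))
    sum_wx = sum((n - 2 * x) * x for x in range(1, n + 1))
    return Fraction(sum_wx, sum_w)


def closed_form(n):
    """Closed form of the same quotient: (n + 1)(n + 2) / 6."""
    return Fraction((n + 1) * (n + 2), 6)


# Brute force and closed form agree on a small case.
assert weighted_avg_exact(1000) == closed_form(1000)

# Exact value for the n = 10,000,000 case from the query above.
print(closed_form(10_000_000))  # prints 16666671666667
```

So the exact value is 16666671666667, and the reported 16666671666717.1 is off by about 50. That gap might simply be float8 accumulation error over ten million terms, but I have not verified that.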


Also, I think it might not give the correct answer even without
negative weights:

create table foo as select floor(random()*10000)::int val from
generate_series(1,10000000);

create table foo2 as select val, count(*) from foo group by val;

Shouldn't these then give the same result:

select stddev_samp(val) from foo;
    stddev_samp
-------------------
 2887.054977297105

select weighted_stddev_samp(val,count) from foo2;
 weighted_stddev_samp
----------------------
     2887.19919651336

The 5th digit seems too early to be seeing round-off error.
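
To show what I would expect, here is a scaled-down sketch, assuming the weight is a frequency weight (each row of foo2 stands for "count" identical rows of foo). Under that assumption the weighted sample stddev should match the plain one up to round-off. The second function is a purely hypothetical variant that applies the n/(n-1) correction with the number of grouped rows instead of the sum of the weights. I have not checked which definition the patch uses.

```python
import math
import random
import statistics
from collections import Counter


def weighted_stddev_freq(values, weights):
    """Sample stddev with weights treated as repeat counts."""
    total_w = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total_w
    ss = sum(w * (v - mean) ** 2 for v, w in zip(values, weights))
    return math.sqrt(ss / (total_w - 1))


def weighted_stddev_rowcount(values, weights):
    """Hypothetical variant: bias correction uses the row count m."""
    total_w = sum(weights)
    m = len(values)
    mean = sum(w * v for v, w in zip(values, weights)) / total_w
    ss = sum(w * (v - mean) ** 2 for v, w in zip(values, weights))
    return math.sqrt(ss / total_w * m / (m - 1))


random.seed(0)
foo = [random.randrange(1000) for _ in range(100_000)]  # like table foo
foo2 = Counter(foo)                                     # like table foo2
vals, counts = list(foo2.keys()), list(foo2.values())

plain = statistics.stdev(foo)
by_freq = weighted_stddev_freq(vals, counts)
by_rows = weighted_stddev_rowcount(vals, counts)
print(plain, by_freq, by_rows)
```

Here plain and by_freq agree closely, while by_rows comes out slightly larger. Plugging in the numbers from the two results above, and assuming foo2 has 10000 rows, 2887.054977 * sqrt((1 - 1e-7) * 10000/9999) is about 2887.1992, which is close to the weighted output. That is only a guess from the numbers, not something I checked against the patch.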

Cheers,

Jeff


