Home > mailing lists

Re: Weighted Stats - Mailing list pgsql-hackers

From	Haribabu Kommi
Subject	Re: Weighted Stats
Date	January 8, 2016 08:38:02
Msg-id	CAJrrPGf1KT-5BN+RPuNXMkCiGXNQs5_RatYqseun-xfF9H+8aQ@mail.gmail.com Whole thread Raw
In response to	Re: Weighted Stats (David Fetter <david@fetter.org>)
Responses	Re: Weighted Stats (David Fetter <david@fetter.org>) Re: Weighted Stats (David Fetter <david@fetter.org>)
List	pgsql-hackers

Tree view

On Mon, Dec 21, 2015 at 1:50 PM, David Fetter <david@fetter.org> wrote:
> On Sun, Dec 20, 2015 at 06:13:33PM -0600, Jim Nasby wrote:
>> On 11/2/15 5:46 PM, David Fetter wrote:
>> >I'd like to add weighted statistics to PostgreSQL
>>
>> Anything happen with this? If community isn't interested, ISTM it'd be good
>> to put this in PGXN.
>
> I think it's already in PGXN as an extension, and I'll get another
> version out this early this week, as it involves mostly adding some
> tests.
>
> I'll do the float8 ones for core this week, too, and unless there's a
> really great reason to do more data types on the first pass, it should
> be in committable shape.

I reviewed the patch, following are my observations.

1. +       precision</type>, <type>numeric</type>, or <type>interval</type>

with interval type it is giving problem. As interval data type is not supported,
so remove it in the list of supported inputs.

postgres=# select weighted_avg(f7,f1) from tbl;
ERROR:  function weighted_avg(interval, smallint) does not exist at character 8
HINT:  No function matches the given name and argument types. You
might need to add explicit type casts.

2. +float8_weighted_avg(PG_FUNCTION_ARGS)

It will be helpful, if you provide some information as a function header,
how the weighted average is calculated similar like other weighted functions.

3. + transvalues = check_float8_array(transarray,
"float8_weighted_stddev_accum", 4);

The second parameter to check_float8_array should be "float8_weighted_accum".

4. There is an OID conflict of 4066 with latest master code.

5.+ A += newvalW * ( newvalX - transvalues[2] ) / W;
+ CHECKFLOATVAL(A, isinf(newvalW) || isinf(newvalX - transvalues[2])
|| isinf(1.0/W), true);

+ Q += newvalW * (newvalX - transvalues[2]) * (newvalX - A);
+ CHECKFLOATVAL(A, isinf(newvalX -  transvalues[3]) || isinf(newvalX -
A) || isinf(1.0/W), true);

Is the need of calculation also needs to be passed to CHECKFLOATVAL?
Just passing
the variables involved in the calculation isn't enough? If expressions
are required then
it should be something as follows?

CHECKFLOATVAL(A, isinf(transvalues[2]) || isinf(newvalW) ||
isinf(newvalX - transvalues[2]) || isinf(1.0/W), true);

CHECKFLOATVAL(Q, isinf(transvalues[3]) || isinf(newvalX -
transvalues[2]) || isinf(newvalX - A) || isinf(1.0/W), true);

I verified the stddev transition and final function calculations
according to wikipedia
and they are fine.

Regards,
Hari Babu
Fujitsu Australia

pgsql-hackers by date:

From: Peter Geoghegan
Date: 08 January 2016, 08:11:30
Subject: Re: Generalizing SortSupport for text to work with char(n), bytea, and alternative opclasses

From: Catalin Iacob
Date: 08 January 2016, 09:32:12
Subject: Re: proposal: PL/Pythonu - function ereport

Re: Weighted Stats - Mailing list pgsql-hackers

Previous

Next