Re: Do we need multiple forms of the SQL2003 statistics - Mailing list pgsql-hackers

From Sergey E. Koposov
Subject Re: Do we need multiple forms of the SQL2003 statistics
Date
Msg-id Pine.LNX.4.64.0607302205330.30743@lnfm1.sai.msu.ru
Whole thread Raw
In response to Re: Do we need multiple forms of the SQL2003 statistics  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sat, 29 Jul 2006, Tom Lane wrote:

> Peter Eisentraut <peter_e@gmx.net> writes:
>> ... From a mathematician's point of view, however, some of these
>> functions normally produce irrational numbers anyway, so it seems
>> unlikely that numeric will be useful.  But looking at the definition
>> of, say, regr_avgx(Y, X), if all the input values are integers, it
>> might be useful if I could get an exact integer or rational number as
>> output, instead of a float, that is.
>
> The question is whether this is useful enough to justify adding a
> thousand lines to numeric.c.
>
> I believe also that the numeric forms of the aggregates will be
> enormously slower than the float forms, at least on most modern
> machines with decent floating-point performance.  I don't have time
> to do any performance testing though.
>

Based on just simple tests On my laptop (1.8Ghz Centrino) the numeric 
versions are 5-10 times slower (depending on what 2arg function is used)

The example:

template1=# \d  test          Table "public.test" Column |       Type       | Modifiers
--------+------------------+----------- na     | numeric          | nb     | numeric          | da     | double
precision| db     | double precision |
 
template1=# select count(*) from test;  count
--------- 1000000
(1 row)

template1=# select regr_avgx(da,db) from test;    regr_avgx
----------------- 0.5002412120227
(1 row)

Time: 1052,893 ms
template1=# select regr_avgx(na,nb) from test;        regr_avgx
------------------------- 0.500040167263887822939
(1 row)

Time: 4459,739 ms

template1=# select regr_sxx(da,db) from test;     regr_sxx
------------------ 83303.6317359119
(1 row)

Time: 1043,891 ms
template1=# select regr_sxx(na,nb) from test;                     regr_sxx
-------------------------------------------------- 83342.044294954140912267902323775495680113567986
(1 row)

Time: 8514,843 ms
template1=# select corr(da,db) from test;         corr
---------------------- 0.000527588261283456
(1 row)

Time: 1074,948 ms
template1=# select corr(na,nb) from test;                                          corr
----------------------------------------------------------------------------------------

0.000759857150984988517883855238363403977440313567465424735082001422354119457463407737
(1 row)

Time: 18327,376 ms


Yes, that's significantly slower, but I don't think that it make the 
functions unusable...

Regards,    Sergey

*******************************************************************
Sergey E. Koposov
Max Planck Institute for Astronomy/Sternberg Astronomical Institute
Tel: +49-6221-528-349
Web: http://lnfm1.sai.msu.ru/~math
E-mail: math@sai.msu.ru


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [PATCHES] 8.2 features?
Next
From: Joe Conway
Date:
Subject: Re: [PATCHES] 8.2 features?