Thread: Re: [BUGS] agregate function sum error

Re: [BUGS] agregate function sum error

From
sferac@bo.nettuno.it
Date:
On Sun, 15 Feb 1998, Marin D wrote:

>
> Hi!
>
> You can try to typecast to float8
>
> I have no idea how MySql is doing SUM() - probably it uses unsigned int
> for SUM() so this gives u twice bigger upper limit - try it with MySql
> with sum exceeding 2^32. I suppose the result will be negative ...
>

Well, I asked this question to MySql mailing list:

> sferac> What's meaning "inf" as a result during a SUM() ?
> sferac> Is it an overflow warning, and if so what's the max range of SUM()
> sferac> before to have an overflow ?
>
> mysql> update t set myfloat=myfloat*9.1;
> sferac> Query OK, 1415 rows affected (4.94 sec)
>
> mysql> select sum(myfloat) from t;
> sferac> 1 row in set (0.04 sec)
>
> sferac> +--------------+
> sferac> | sum(myfloat) |
> sferac> +--------------+
> sferac> |          Inf |
> sferac> +--------------+
>
> mysqld does all normal calculation with doubles (bit functions are
> done with longlong);  The range of a double is typical something like:
>
> #define       DBL_MAX         1.7976931348623157E+308
> #define       DBL_MIN         2.2250738585072014E-308
>
> Yours,
> Monty

Isn't it a good solution to implement PostgreSQL SUM() function ?
What do you think about ?

____________________________Jose' Soares Da Silva_____________________________
           Via Bazzanese, 69      ),,,(          www.sferacarta.com
    Casalecchio R.(BO) Italy     +|o o|+         sferac@bo.nettuno.it
        Fax. ++39 51 6131537      /. .\          Tel. ++39 51  591054
---------------------------------(_---_)--------------------------------------
"...when men cease to destroy the animal race the lion
 and the lamb can dwell together..." (Joseph Smith)


Re: [HACKERS] Re: [BUGS] agregate function sum error

From
"Thomas G. Lockhart"
Date:
> > You can try to typecast to float8
> >
> > I have no idea how MySql is doing SUM() - probably it uses unsigned int
> > for SUM() so this gives u twice bigger upper limit - try it with MySql
> > with sum exceeding 2^32. I suppose the result will be negative ...
> >
>
> Well, I asked this question to MySql mailing list:
>
> > sferac> What's meaning "inf" as a result during a SUM() ?
> > sferac> Is it an overflow warning, and if so what's the max range of SUM()
> > sferac> before to have an overflow ?
> >
> > mysql> update t set myfloat=myfloat*9.1;
> > sferac> Query OK, 1415 rows affected (4.94 sec)
> >
> > mysql> select sum(myfloat) from t;
> > sferac> 1 row in set (0.04 sec)
> >
> > sferac> +--------------+
> > sferac> | sum(myfloat) |
> > sferac> +--------------+
> > sferac> |          Inf |
> > sferac> +--------------+
> >
> > mysqld does all normal calculation with doubles (bit functions are
> > done with longlong);  The range of a double is typical something like:
> >
> > #define       DBL_MAX         1.7976931348623157E+308
> > #define       DBL_MIN         2.2250738585072014E-308

The limits are the same as for Postgres. The difference is that Postgres throws
an error on floating overflows, rather than returning "infinity". I can think of
arguments for both behaviors, but one problem is that at least one older platform
(vax) does not have IEEE arithmetic so can't work with "infinity" :(

We do not currently use "longlong" for any integers in the backend, because I
didn't know how this is implemented/supported for _every_ platform on our
supported list. I had put an "int8" package into the contrib directory to test
this, and to get feedback for other platforms, but have gotten only one response
and it was for a platform (alpha) which I already knew would work.

Perhaps I'll make a concerted effort to get this tested for v6.4; scrappy had
suggested just putting int8s into the backend and then working out the porting
issues and I think I'll do that :) Then, we can use int8 variables in other
places where necessary for v6.5...

                                                - Tom


Re: [HACKERS] Re: [BUGS] agregate function sum error

From
"Aleksandr A. Belinsky"
Date:



On Tue, 17 Feb 1998, Thomas G. Lockhart wrote:

> > > You can try to typecast to float8
> > >
> > > I have no idea how MySql is doing SUM() - probably it uses unsigned int
> > > for SUM() so this gives u twice bigger upper limit - try it with MySql
> > > with sum exceeding 2^32. I suppose the result will be negative ...
> > >
> >
> > Well, I asked this question to MySql mailing list:
> >
> > > sferac> What's meaning "inf" as a result during a SUM() ?
> > > sferac> Is it an overflow warning, and if so what's the max range of SUM()
> > > sferac> before to have an overflow ?
> > >
> > > mysql> update t set myfloat=myfloat*9.1;
> > > sferac> Query OK, 1415 rows affected (4.94 sec)
> > >
> > > mysql> select sum(myfloat) from t;
> > > sferac> 1 row in set (0.04 sec)
> > >
> > > sferac> +--------------+
> > > sferac> | sum(myfloat) |
> > > sferac> +--------------+
> > > sferac> |          Inf |
> > > sferac> +--------------+
> > >
> > > mysqld does all normal calculation with doubles (bit functions are
> > > done with longlong);  The range of a double is typical something like:
> > >
> > > #define       DBL_MAX         1.7976931348623157E+308
> > > #define       DBL_MIN         2.2250738585072014E-308
>
> The limits are the same as for Postgres. The difference is that Postgres throws
> an error on floating overflows, rather than returning "infinity". I can think of
> arguments for both behaviors, but one problem is that at least one older platform
> (vax) does not have IEEE arithmetic so can't work with "infinity" :(
>
> We do not currently use "longlong" for any integers in the backend, because I
> didn't know how this is implemented/supported for _every_ platform on our
> supported list. I had put an "int8" package into the contrib directory to test
> this, and to get feedback for other platforms, but have gotten only one response
> and it was for a platform (alpha) which I already knew would work.
>
> Perhaps I'll make a concerted effort to get this tested for v6.4; scrappy had
> suggested just putting int8s into the backend and then working out the porting
> issues and I think I'll do that :) Then, we can use int8 variables in other
> places where necessary for v6.5...
>
>                                                 - Tom
>
>

_______________________________________________________________________________
  Aleksandr Belinsky
     Sternberg  Astronomical Institute,  Moscow  University  (Russia)
         E-mail: aleks@sai.msu.su  http://zeus.sai.msu.su/~aleks/
              _____________________________________________


Re: [HACKERS] Re: [BUGS] agregate function sum error

From
sferac@bo.nettuno.it
Date:
On Tue, 17 Feb 1998, Thomas G. Lockhart wrote:

> > > You can try to typecast to float8
> > >
> > > I have no idea how MySql is doing SUM() - probably it uses unsigned int
> > > for SUM() so this gives u twice bigger upper limit - try it with MySql
> > > with sum exceeding 2^32. I suppose the result will be negative ...
> > >
> >
> > Well, I asked this question to MySql mailing list:
> >
> > > sferac> What's meaning "inf" as a result during a SUM() ?
> > > sferac> Is it an overflow warning, and if so what's the max range of SUM()
> > > sferac> before to have an overflow ?
> > >
> > > mysql> update t set myfloat=myfloat*9.1;
> > > sferac> Query OK, 1415 rows affected (4.94 sec)
> > >
> > > mysql> select sum(myfloat) from t;
> > > sferac> 1 row in set (0.04 sec)
> > >
> > > sferac> +--------------+
> > > sferac> | sum(myfloat) |
> > > sferac> +--------------+
> > > sferac> |          Inf |
> > > sferac> +--------------+
> > >
> > > mysqld does all normal calculation with doubles (bit functions are
> > > done with longlong);  The range of a double is typical something like:
> > >
> > > #define       DBL_MAX         1.7976931348623157E+308
> > > #define       DBL_MIN         2.2250738585072014E-308
>
> The limits are the same as for Postgres. The difference is that Postgres throws
> an error on floating overflows, rather than returning "infinity".

I think overflows on float8 is OK, the problem is about overflows on int4,
int2 and float4.
If PostgreSQL does all calculation using float8 (even on int2, int4 and float4)
we have acceptable overflows. Why we don't do this in this way ?

I can think of
> arguments for both behaviors, but one problem is that at least one older platform
> (vax) does not have IEEE arithmetic so can't work with "infinity" :(
>
> We do not currently use "longlong" for any integers in the backend, because I
> didn't know how this is implemented/supported for _every_ platform on our
> supported list. I had put an "int8" package into the contrib directory to test
> this, and to get feedback for other platforms, but have gotten only one response
> and it was for a platform (alpha) which I already knew would work.
>
> Perhaps I'll make a concerted effort to get this tested for v6.4; scrappy had
> suggested just putting int8s into the backend and then working out the porting
> issues and I think I'll do that :) Then, we can use int8 variables in other
> places where necessary for v6.5...
                                                            Ciao, Jose'


Re: [HACKERS] Re: [BUGS] agregate function sum error

From
Bruce Momjian
Date:
>
> On Tue, 17 Feb 1998, Thomas G. Lockhart wrote:
>
> > > > You can try to typecast to float8
> > > >
> > > > I have no idea how MySql is doing SUM() - probably it uses unsigned int
> > > > for SUM() so this gives u twice bigger upper limit - try it with MySql
> > > > with sum exceeding 2^32. I suppose the result will be negative ...
> > > >
> > >
> > > Well, I asked this question to MySql mailing list:
> > >
> > > > sferac> What's meaning "inf" as a result during a SUM() ?
> > > > sferac> Is it an overflow warning, and if so what's the max range of SUM()
> > > > sferac> before to have an overflow ?
> > > >
> > > > mysql> update t set myfloat=myfloat*9.1;
> > > > sferac> Query OK, 1415 rows affected (4.94 sec)
> > > >
> > > > mysql> select sum(myfloat) from t;
> > > > sferac> 1 row in set (0.04 sec)
> > > >
> > > > sferac> +--------------+
> > > > sferac> | sum(myfloat) |
> > > > sferac> +--------------+
> > > > sferac> |          Inf |
> > > > sferac> +--------------+
> > > >
> > > > mysqld does all normal calculation with doubles (bit functions are
> > > > done with longlong);  The range of a double is typical something like:
> > > >
> > > > #define       DBL_MAX         1.7976931348623157E+308
> > > > #define       DBL_MIN         2.2250738585072014E-308
> >
> > The limits are the same as for Postgres. The difference is that Postgres throws
> > an error on floating overflows, rather than returning "infinity".
>
> I think overflows on float8 is OK, the problem is about overflows on int4,
> int2 and float4.
> If PostgreSQL does all calculation using float8 (even on int2, int4 and float4)
> we have acceptable overflows. Why we don't do this in this way ?

Can you imagine the problems we would have?  Integral types and floats
are two different things.  They behave differently in division,
rounding, and overflows.  For those reasons, and performance, we can't
just use floats all the time.



--
Bruce Momjian
maillist@candle.pha.pa.us

Re: [HACKERS] Re: [BUGS] agregate function sum error

From
sferac@bo.nettuno.it
Date:
On Thu, 19 Feb 1998, Bruce Momjian wrote:

> >
> > On Tue, 17 Feb 1998, Thomas G. Lockhart wrote:
> >
> > > > > You can try to typecast to float8
> > > > >
> > > > > I have no idea how MySql is doing SUM() - probably it uses unsigned int
> > > > > for SUM() so this gives u twice bigger upper limit - try it with MySql
> > > > > with sum exceeding 2^32. I suppose the result will be negative ...
> > > > >
> > > >
> > > > Well, I asked this question to MySql mailing list:
> > > >
> > > > > sferac> What's meaning "inf" as a result during a SUM() ?
> > > > > sferac> Is it an overflow warning, and if so what's the max range of SUM()
> > > > > sferac> before to have an overflow ?
> > > > >
> > > > > mysql> update t set myfloat=myfloat*9.1;
> > > > > sferac> Query OK, 1415 rows affected (4.94 sec)
> > > > >
> > > > > mysql> select sum(myfloat) from t;
> > > > > sferac> 1 row in set (0.04 sec)
> > > > >
> > > > > sferac> +--------------+
> > > > > sferac> | sum(myfloat) |
> > > > > sferac> +--------------+
> > > > > sferac> |          Inf |
> > > > > sferac> +--------------+
> > > > >
> > > > > mysqld does all normal calculation with doubles (bit functions are
> > > > > done with longlong);  The range of a double is typical something like:
> > > > >
> > > > > #define       DBL_MAX         1.7976931348623157E+308
> > > > > #define       DBL_MIN         2.2250738585072014E-308
> > >
> > > The limits are the same as for Postgres. The difference is that Postgres throws
> > > an error on floating overflows, rather than returning "infinity".
> >
> > I think overflows on float8 is OK, the problem is about overflows on int4,
> > int2 and float4.
> > If PostgreSQL does all calculation using float8 (even on int2, int4 and float4)
> > we have acceptable overflows. Why we don't do this in this way ?
>
> Can you imagine the problems we would have?  Integral types and floats
> are two different things.  They behave differently in division,
> rounding, and overflows.  For those reasons, and performance, we can't
> just use floats all the time.
>
I see..., then what we need is int8 as Tom said, to avoid to write code as:

select cast mysmall as int4 * cast mysmall as int4 from mytable;
select mysmall::int4 * mysmall::int4 from mytable;
select int4(mysmall) * int4(mysmall) from mytable;
select sum(int4(mysmall)) from mytable;

that has nothing to do with SQL standard, and isn't too friendly,
those conversions should be done automaticly by PostgreSQL whitout
user's help as MySQL, Solid and other databases do.
-----------
PS: I installed int8 in my Linux box, but I have an error during insert time:

insert into qtest values('123','456');
ERROR:  fmgr_info: function 159745: cache lookup failed

insert into qtest values('123','4567890123456789');
ERROR:  fmgr_info: function 159745: cache lookup failed

insert into qtest values('4567890123456789','123');
ERROR:  fmgr_info: function 159745: cache lookup failed

insert into qtest values('4567890123456789','4567890123456789');
ERROR:  fmgr_info: function 159745: cache lookup failed

insert into qtest values('4567890123456789','-4567890123456789');
ERROR:  fmgr_info: function 159745: cache lookup failed

                                                            Ciao, Jose'


Re: [HACKERS] Re: [BUGS] agregate function sum error

From
"Thomas G. Lockhart"
Date:
> > > > The limits are the same as for Postgres. The difference is that Postgres throws
> > > > an error on floating overflows, rather than returning "infinity".
> > >
> > > I think overflows on float8 is OK, the problem is about overflows on int4,
> > > int2 and float4.

What would it take to have backends start signalling ERROR (not core dump :) on integer
overflows? It would be nice to have the compile-time option; perhaps it is already
somewhere in the backend.

The SQL92 standard requires an abort on floating overflows, though _just_ because the
standard says it doesn't mean we have to do it. We should do what makes sense. However,
once an IEEE value becomes "Inf" further math operations keep it at "Inf", which
propagates the problem farther into your calculation. Probably better to signal the
error and let the user fix the overflow problem first. However, if you want to generate
compile-time patches...

> > > If PostgreSQL does all calculation using float8 (even on int2, int4 and float4)
> > > we have acceptable overflows. Why we don't do this in this way ?
> > Can you imagine the problems we would have?  Integral types and floats
> > are two different things.  They behave differently in division,
> > rounding, and overflows.  For those reasons, and performance, we can't
> > just use floats all the time.
> >
> I see..., then what we need is int8 as Tom said, to avoid to write code as:
>
> select cast mysmall as int4 * cast mysmall as int4 from mytable;
> select mysmall::int4 * mysmall::int4 from mytable;
> select int4(mysmall) * int4(mysmall) from mytable;
> select sum(int4(mysmall)) from mytable;
>
> that has nothing to do with SQL standard, and isn't too friendly,
> those conversions should be done automaticly by PostgreSQL whitout
> user's help as MySQL, Solid and other databases do.

Well, not all real databases do this, though I see your point. There is a performance
tradeoff between always promoting types when doing arithmetic and leaving the type
consistant. Also, "hidden" promotions may be less intuitive for some users, and lead to
unexpected behavior elsewhere.

> PS: I installed int8 in my Linux box, but I have an error during insert time:
> insert into qtest values('123','456');
> ERROR:  fmgr_info: function 159745: cache lookup failed

Hmm. Works on my machine (980217 cvs tree):

create table qtest(q1 int8, q2 int8);
CREATE
...
insert into qtest values('123','4567890123456789');
INSERT 1018571 1
...
select * from qtest;
              q1|               q2
----------------+-----------------
             123|              456
             123| 4567890123456789
4567890123456789|              123
4567890123456789| 4567890123456789
4567890123456789|-4567890123456789
(5 rows)

Did you try from a clean database? How about from a clean install??

                                                  - Tom


Re: [HACKERS] Re: [BUGS] agregate function sum error

From
sferac@bo.nettuno.it
Date:
On Fri, 20 Feb 1998, Thomas G. Lockhart wrote:

> > > > > The limits are the same as for Postgres. The difference is that Postgres throws
> > > > > an error on floating overflows, rather than returning "infinity".
> > > >
> > > > I think overflows on float8 is OK, the problem is about overflows on int4,
> > > > int2 and float4.
>
> What would it take to have backends start signalling ERROR (not core dump :) on integer
> overflows? It would be nice to have the compile-time option; perhaps it is already
> somewhere in the backend.
>
> The SQL92 standard requires an abort on floating overflows, though _just_ because the
> standard says it doesn't mean we have to do it. We should do what makes sense. However,
> once an IEEE value becomes "Inf" further math operations keep it at "Inf", which
> propagates the problem farther into your calculation. Probably better to signal the
> error and let the user fix the overflow problem first. However, if you want to generate
> compile-time patches...
>
> > > > If PostgreSQL does all calculation using float8 (even on int2, int4 and float4)
> > > > we have acceptable overflows. Why we don't do this in this way ?
> > > Can you imagine the problems we would have?  Integral types and floats
> > > are two different things.  They behave differently in division,
> > > rounding, and overflows.  For those reasons, and performance, we can't
> > > just use floats all the time.
> > >
> > I see..., then what we need is int8 as Tom said, to avoid to write code as:
> >
> > select cast mysmall as int4 * cast mysmall as int4 from mytable;
> > select mysmall::int4 * mysmall::int4 from mytable;
> > select int4(mysmall) * int4(mysmall) from mytable;
> > select sum(int4(mysmall)) from mytable;
> >
> > that has nothing to do with SQL standard, and isn't too friendly,
> > those conversions should be done automaticly by PostgreSQL whitout
> > user's help as MySQL, Solid and other databases do.
>
> Well, not all real databases do this, though I see your point. There is a performance
> tradeoff between always promoting types when doing arithmetic and leaving the type
> consistant. Also, "hidden" promotions may be less intuitive for some users, and lead to
> unexpected behavior elsewhere.
>
Hi Tom,
Apologies for my insistence and for my bad english, I like to watch things
from point of view of end users.
(are databases for end users or programmers ??)
We programmers, often forget that end users don't want to know what's happening
behind their own screen, they don't want write ostrogothics sentences ala C
to query a database, they had learned SQL whith so much effort and now they
want to query database without thought about dificulties, that C find when
it need convert numbers from one type to another.
I thing that one of the bigest dificulty of PostgreSQL is how it treats numbers.
SQL standard says that a number is comparable to, and compatible with, all
other numbers (that is: all numbers are mutually comparable and mutually
assignable). I think it means directly, without force it with the CAST or
functions.
I can't imagine end users using things like this:
        UPDATE my_table SET my_small=int2(my_int4);
        SELECT SUM(float8(my_int) FROM my_table;
        SELECT mysmall::int4 * mysmall::int4 FROM my_table;

End users hate uneccessaries or non sense messages like the following
when he/she writes a sentence like this:

        UPDATE my_table SET my_int=my_small;

and PostgreSQL show him/her this message ?
---------------------------------------------------------------------
WARN: type of my_small does not match target column my_int
---------------------------------------------------------------------
- What message is this ? If PostgreSQL understand that I want to do:
   UPDATE my_table SET my_int=my_small::int4;
why it doesn't do that for me?

         SELECT my_small * my_float FROM my_table;
---------------------------------------------------------------------
NOTICE:there is no operator * for types int2 and float8
NOTICE:You will either have to retype this query using an
NOTICE:explicit cast, or you will have to define the operator
WARN:* for int2 and float8 using CREATE OPERATOR
---------------------------------------------------------------------

- Well, if PostgreSQL know what I want why it doesn't do it for me ?
I know that conversion is more expensive than warnning but it's more
friendly. What do you think? ;-)
                                                 Regards, Jose'


Re: [HACKERS] Re: [BUGS] agregate function sum error

From
"Thomas G. Lockhart"
Date:
> I thing that one of the bigest dificulty of PostgreSQL is how it treats numbers.
> SQL standard says that a number is comparable to, and compatible with, all
> other numbers (that is: all numbers are mutually comparable and mutually
> assignable). I think it means directly, without force it with the CAST or
> functions.

I agree that the numeric (and other) type coersions need some more work. One of the reasons
why we have been reluctant to embed _too_ many automatic coersions into the backend is that
it does not extend well into the type-extensible features of Postgres. But I hope to look at
this some for v6.4.

> and PostgreSQL show him/her this message ?
> WARN: type of my_small does not match target column my_int

this is an example of behavior which I think we _can_ change...

> - Well, if PostgreSQL know what I want why it doesn't do it for me ?
> I know that conversion is more expensive than warnning but it's more
> friendly. What do you think? ;-)

I agree up to a point. Once a user has chosen to use an "optimized data type" like "int2"
rather than "integer" or "int4" then they have also chosen to give up some capabilities of
the generic integer type such as dynamic range. However, some extra conversion stuff can and
should happen, and I have some ideas on how to do this. If any hacker wants to work on it,
let's start talking after the upcoming v6.3 release...

                                                      - Tom