Thread: Re: [BUGS] agregate function sum error
On Sun, 15 Feb 1998, Marin D wrote: > > Hi! > > You can try to typecast to float8 > > I have no idea how MySql is doing SUM() - probably it uses unsigned int > for SUM() so this gives u twice bigger upper limit - try it with MySql > with sum exceeding 2^32. I suppose the result will be negative ... > Well, I asked this question to MySql mailing list: > sferac> What's meaning "inf" as a result during a SUM() ? > sferac> Is it an overflow warning, and if so what's the max range of SUM() > sferac> before to have an overflow ? > > mysql> update t set myfloat=myfloat*9.1; > sferac> Query OK, 1415 rows affected (4.94 sec) > > mysql> select sum(myfloat) from t; > sferac> 1 row in set (0.04 sec) > > sferac> +--------------+ > sferac> | sum(myfloat) | > sferac> +--------------+ > sferac> | Inf | > sferac> +--------------+ > > mysqld does all normal calculation with doubles (bit functions are > done with longlong); The range of a double is typical something like: > > #define DBL_MAX 1.7976931348623157E+308 > #define DBL_MIN 2.2250738585072014E-308 > > Yours, > Monty Isn't it a good solution to implement PostgreSQL SUM() function ? What do you think about ? ____________________________Jose' Soares Da Silva_____________________________ Via Bazzanese, 69 ),,,( www.sferacarta.com Casalecchio R.(BO) Italy +|o o|+ sferac@bo.nettuno.it Fax. ++39 51 6131537 /. .\ Tel. ++39 51 591054 ---------------------------------(_---_)-------------------------------------- "...when men cease to destroy the animal race the lion and the lamb can dwell together..." (Joseph Smith)
> > You can try to typecast to float8 > > > > I have no idea how MySql is doing SUM() - probably it uses unsigned int > > for SUM() so this gives u twice bigger upper limit - try it with MySql > > with sum exceeding 2^32. I suppose the result will be negative ... > > > > Well, I asked this question to MySql mailing list: > > > sferac> What's meaning "inf" as a result during a SUM() ? > > sferac> Is it an overflow warning, and if so what's the max range of SUM() > > sferac> before to have an overflow ? > > > > mysql> update t set myfloat=myfloat*9.1; > > sferac> Query OK, 1415 rows affected (4.94 sec) > > > > mysql> select sum(myfloat) from t; > > sferac> 1 row in set (0.04 sec) > > > > sferac> +--------------+ > > sferac> | sum(myfloat) | > > sferac> +--------------+ > > sferac> | Inf | > > sferac> +--------------+ > > > > mysqld does all normal calculation with doubles (bit functions are > > done with longlong); The range of a double is typical something like: > > > > #define DBL_MAX 1.7976931348623157E+308 > > #define DBL_MIN 2.2250738585072014E-308 The limits are the same as for Postgres. The difference is that Postgres throws an error on floating overflows, rather than returning "infinity". I can think of arguments for both behaviors, but one problem is that at least one older platform (vax) does not have IEEE arithmetic so can't work with "infinity" :( We do not currently use "longlong" for any integers in the backend, because I didn't know how this is implemented/supported for _every_ platform on our supported list. I had put an "int8" package into the contrib directory to test this, and to get feedback for other platforms, but have gotten only one response and it was for a platform (alpha) which I already knew would work. Perhaps I'll make a concerted effort to get this tested for v6.4; scrappy had suggested just putting int8s into the backend and then working out the porting issues and I think I'll do that :) Then, we can use int8 variables in other places where necessary for v6.5... - Tom
On Tue, 17 Feb 1998, Thomas G. Lockhart wrote: > > > You can try to typecast to float8 > > > > > > I have no idea how MySql is doing SUM() - probably it uses unsigned int > > > for SUM() so this gives u twice bigger upper limit - try it with MySql > > > with sum exceeding 2^32. I suppose the result will be negative ... > > > > > > > Well, I asked this question to MySql mailing list: > > > > > sferac> What's meaning "inf" as a result during a SUM() ? > > > sferac> Is it an overflow warning, and if so what's the max range of SUM() > > > sferac> before to have an overflow ? > > > > > > mysql> update t set myfloat=myfloat*9.1; > > > sferac> Query OK, 1415 rows affected (4.94 sec) > > > > > > mysql> select sum(myfloat) from t; > > > sferac> 1 row in set (0.04 sec) > > > > > > sferac> +--------------+ > > > sferac> | sum(myfloat) | > > > sferac> +--------------+ > > > sferac> | Inf | > > > sferac> +--------------+ > > > > > > mysqld does all normal calculation with doubles (bit functions are > > > done with longlong); The range of a double is typical something like: > > > > > > #define DBL_MAX 1.7976931348623157E+308 > > > #define DBL_MIN 2.2250738585072014E-308 > > The limits are the same as for Postgres. The difference is that Postgres throws > an error on floating overflows, rather than returning "infinity". I can think of > arguments for both behaviors, but one problem is that at least one older platform > (vax) does not have IEEE arithmetic so can't work with "infinity" :( > > We do not currently use "longlong" for any integers in the backend, because I > didn't know how this is implemented/supported for _every_ platform on our > supported list. I had put an "int8" package into the contrib directory to test > this, and to get feedback for other platforms, but have gotten only one response > and it was for a platform (alpha) which I already knew would work. > > Perhaps I'll make a concerted effort to get this tested for v6.4; scrappy had > suggested just putting int8s into the backend and then working out the porting > issues and I think I'll do that :) Then, we can use int8 variables in other > places where necessary for v6.5... > > - Tom > > _______________________________________________________________________________ Aleksandr Belinsky Sternberg Astronomical Institute, Moscow University (Russia) E-mail: aleks@sai.msu.su http://zeus.sai.msu.su/~aleks/ _____________________________________________
On Tue, 17 Feb 1998, Thomas G. Lockhart wrote: > > > You can try to typecast to float8 > > > > > > I have no idea how MySql is doing SUM() - probably it uses unsigned int > > > for SUM() so this gives u twice bigger upper limit - try it with MySql > > > with sum exceeding 2^32. I suppose the result will be negative ... > > > > > > > Well, I asked this question to MySql mailing list: > > > > > sferac> What's meaning "inf" as a result during a SUM() ? > > > sferac> Is it an overflow warning, and if so what's the max range of SUM() > > > sferac> before to have an overflow ? > > > > > > mysql> update t set myfloat=myfloat*9.1; > > > sferac> Query OK, 1415 rows affected (4.94 sec) > > > > > > mysql> select sum(myfloat) from t; > > > sferac> 1 row in set (0.04 sec) > > > > > > sferac> +--------------+ > > > sferac> | sum(myfloat) | > > > sferac> +--------------+ > > > sferac> | Inf | > > > sferac> +--------------+ > > > > > > mysqld does all normal calculation with doubles (bit functions are > > > done with longlong); The range of a double is typical something like: > > > > > > #define DBL_MAX 1.7976931348623157E+308 > > > #define DBL_MIN 2.2250738585072014E-308 > > The limits are the same as for Postgres. The difference is that Postgres throws > an error on floating overflows, rather than returning "infinity". I think overflows on float8 is OK, the problem is about overflows on int4, int2 and float4. If PostgreSQL does all calculation using float8 (even on int2, int4 and float4) we have acceptable overflows. Why we don't do this in this way ? I can think of > arguments for both behaviors, but one problem is that at least one older platform > (vax) does not have IEEE arithmetic so can't work with "infinity" :( > > We do not currently use "longlong" for any integers in the backend, because I > didn't know how this is implemented/supported for _every_ platform on our > supported list. I had put an "int8" package into the contrib directory to test > this, and to get feedback for other platforms, but have gotten only one response > and it was for a platform (alpha) which I already knew would work. > > Perhaps I'll make a concerted effort to get this tested for v6.4; scrappy had > suggested just putting int8s into the backend and then working out the porting > issues and I think I'll do that :) Then, we can use int8 variables in other > places where necessary for v6.5... Ciao, Jose'
> > On Tue, 17 Feb 1998, Thomas G. Lockhart wrote: > > > > > You can try to typecast to float8 > > > > > > > > I have no idea how MySql is doing SUM() - probably it uses unsigned int > > > > for SUM() so this gives u twice bigger upper limit - try it with MySql > > > > with sum exceeding 2^32. I suppose the result will be negative ... > > > > > > > > > > Well, I asked this question to MySql mailing list: > > > > > > > sferac> What's meaning "inf" as a result during a SUM() ? > > > > sferac> Is it an overflow warning, and if so what's the max range of SUM() > > > > sferac> before to have an overflow ? > > > > > > > > mysql> update t set myfloat=myfloat*9.1; > > > > sferac> Query OK, 1415 rows affected (4.94 sec) > > > > > > > > mysql> select sum(myfloat) from t; > > > > sferac> 1 row in set (0.04 sec) > > > > > > > > sferac> +--------------+ > > > > sferac> | sum(myfloat) | > > > > sferac> +--------------+ > > > > sferac> | Inf | > > > > sferac> +--------------+ > > > > > > > > mysqld does all normal calculation with doubles (bit functions are > > > > done with longlong); The range of a double is typical something like: > > > > > > > > #define DBL_MAX 1.7976931348623157E+308 > > > > #define DBL_MIN 2.2250738585072014E-308 > > > > The limits are the same as for Postgres. The difference is that Postgres throws > > an error on floating overflows, rather than returning "infinity". > > I think overflows on float8 is OK, the problem is about overflows on int4, > int2 and float4. > If PostgreSQL does all calculation using float8 (even on int2, int4 and float4) > we have acceptable overflows. Why we don't do this in this way ? Can you imagine the problems we would have? Integral types and floats are two different things. They behave differently in division, rounding, and overflows. For those reasons, and performance, we can't just use floats all the time. -- Bruce Momjian maillist@candle.pha.pa.us
On Thu, 19 Feb 1998, Bruce Momjian wrote: > > > > On Tue, 17 Feb 1998, Thomas G. Lockhart wrote: > > > > > > > You can try to typecast to float8 > > > > > > > > > > I have no idea how MySql is doing SUM() - probably it uses unsigned int > > > > > for SUM() so this gives u twice bigger upper limit - try it with MySql > > > > > with sum exceeding 2^32. I suppose the result will be negative ... > > > > > > > > > > > > > Well, I asked this question to MySql mailing list: > > > > > > > > > sferac> What's meaning "inf" as a result during a SUM() ? > > > > > sferac> Is it an overflow warning, and if so what's the max range of SUM() > > > > > sferac> before to have an overflow ? > > > > > > > > > > mysql> update t set myfloat=myfloat*9.1; > > > > > sferac> Query OK, 1415 rows affected (4.94 sec) > > > > > > > > > > mysql> select sum(myfloat) from t; > > > > > sferac> 1 row in set (0.04 sec) > > > > > > > > > > sferac> +--------------+ > > > > > sferac> | sum(myfloat) | > > > > > sferac> +--------------+ > > > > > sferac> | Inf | > > > > > sferac> +--------------+ > > > > > > > > > > mysqld does all normal calculation with doubles (bit functions are > > > > > done with longlong); The range of a double is typical something like: > > > > > > > > > > #define DBL_MAX 1.7976931348623157E+308 > > > > > #define DBL_MIN 2.2250738585072014E-308 > > > > > > The limits are the same as for Postgres. The difference is that Postgres throws > > > an error on floating overflows, rather than returning "infinity". > > > > I think overflows on float8 is OK, the problem is about overflows on int4, > > int2 and float4. > > If PostgreSQL does all calculation using float8 (even on int2, int4 and float4) > > we have acceptable overflows. Why we don't do this in this way ? > > Can you imagine the problems we would have? Integral types and floats > are two different things. They behave differently in division, > rounding, and overflows. For those reasons, and performance, we can't > just use floats all the time. > I see..., then what we need is int8 as Tom said, to avoid to write code as: select cast mysmall as int4 * cast mysmall as int4 from mytable; select mysmall::int4 * mysmall::int4 from mytable; select int4(mysmall) * int4(mysmall) from mytable; select sum(int4(mysmall)) from mytable; that has nothing to do with SQL standard, and isn't too friendly, those conversions should be done automaticly by PostgreSQL whitout user's help as MySQL, Solid and other databases do. ----------- PS: I installed int8 in my Linux box, but I have an error during insert time: insert into qtest values('123','456'); ERROR: fmgr_info: function 159745: cache lookup failed insert into qtest values('123','4567890123456789'); ERROR: fmgr_info: function 159745: cache lookup failed insert into qtest values('4567890123456789','123'); ERROR: fmgr_info: function 159745: cache lookup failed insert into qtest values('4567890123456789','4567890123456789'); ERROR: fmgr_info: function 159745: cache lookup failed insert into qtest values('4567890123456789','-4567890123456789'); ERROR: fmgr_info: function 159745: cache lookup failed Ciao, Jose'
> > > > The limits are the same as for Postgres. The difference is that Postgres throws > > > > an error on floating overflows, rather than returning "infinity". > > > > > > I think overflows on float8 is OK, the problem is about overflows on int4, > > > int2 and float4. What would it take to have backends start signalling ERROR (not core dump :) on integer overflows? It would be nice to have the compile-time option; perhaps it is already somewhere in the backend. The SQL92 standard requires an abort on floating overflows, though _just_ because the standard says it doesn't mean we have to do it. We should do what makes sense. However, once an IEEE value becomes "Inf" further math operations keep it at "Inf", which propagates the problem farther into your calculation. Probably better to signal the error and let the user fix the overflow problem first. However, if you want to generate compile-time patches... > > > If PostgreSQL does all calculation using float8 (even on int2, int4 and float4) > > > we have acceptable overflows. Why we don't do this in this way ? > > Can you imagine the problems we would have? Integral types and floats > > are two different things. They behave differently in division, > > rounding, and overflows. For those reasons, and performance, we can't > > just use floats all the time. > > > I see..., then what we need is int8 as Tom said, to avoid to write code as: > > select cast mysmall as int4 * cast mysmall as int4 from mytable; > select mysmall::int4 * mysmall::int4 from mytable; > select int4(mysmall) * int4(mysmall) from mytable; > select sum(int4(mysmall)) from mytable; > > that has nothing to do with SQL standard, and isn't too friendly, > those conversions should be done automaticly by PostgreSQL whitout > user's help as MySQL, Solid and other databases do. Well, not all real databases do this, though I see your point. There is a performance tradeoff between always promoting types when doing arithmetic and leaving the type consistant. Also, "hidden" promotions may be less intuitive for some users, and lead to unexpected behavior elsewhere. > PS: I installed int8 in my Linux box, but I have an error during insert time: > insert into qtest values('123','456'); > ERROR: fmgr_info: function 159745: cache lookup failed Hmm. Works on my machine (980217 cvs tree): create table qtest(q1 int8, q2 int8); CREATE ... insert into qtest values('123','4567890123456789'); INSERT 1018571 1 ... select * from qtest; q1| q2 ----------------+----------------- 123| 456 123| 4567890123456789 4567890123456789| 123 4567890123456789| 4567890123456789 4567890123456789|-4567890123456789 (5 rows) Did you try from a clean database? How about from a clean install?? - Tom
On Fri, 20 Feb 1998, Thomas G. Lockhart wrote: > > > > > The limits are the same as for Postgres. The difference is that Postgres throws > > > > > an error on floating overflows, rather than returning "infinity". > > > > > > > > I think overflows on float8 is OK, the problem is about overflows on int4, > > > > int2 and float4. > > What would it take to have backends start signalling ERROR (not core dump :) on integer > overflows? It would be nice to have the compile-time option; perhaps it is already > somewhere in the backend. > > The SQL92 standard requires an abort on floating overflows, though _just_ because the > standard says it doesn't mean we have to do it. We should do what makes sense. However, > once an IEEE value becomes "Inf" further math operations keep it at "Inf", which > propagates the problem farther into your calculation. Probably better to signal the > error and let the user fix the overflow problem first. However, if you want to generate > compile-time patches... > > > > > If PostgreSQL does all calculation using float8 (even on int2, int4 and float4) > > > > we have acceptable overflows. Why we don't do this in this way ? > > > Can you imagine the problems we would have? Integral types and floats > > > are two different things. They behave differently in division, > > > rounding, and overflows. For those reasons, and performance, we can't > > > just use floats all the time. > > > > > I see..., then what we need is int8 as Tom said, to avoid to write code as: > > > > select cast mysmall as int4 * cast mysmall as int4 from mytable; > > select mysmall::int4 * mysmall::int4 from mytable; > > select int4(mysmall) * int4(mysmall) from mytable; > > select sum(int4(mysmall)) from mytable; > > > > that has nothing to do with SQL standard, and isn't too friendly, > > those conversions should be done automaticly by PostgreSQL whitout > > user's help as MySQL, Solid and other databases do. > > Well, not all real databases do this, though I see your point. There is a performance > tradeoff between always promoting types when doing arithmetic and leaving the type > consistant. Also, "hidden" promotions may be less intuitive for some users, and lead to > unexpected behavior elsewhere. > Hi Tom, Apologies for my insistence and for my bad english, I like to watch things from point of view of end users. (are databases for end users or programmers ??) We programmers, often forget that end users don't want to know what's happening behind their own screen, they don't want write ostrogothics sentences ala C to query a database, they had learned SQL whith so much effort and now they want to query database without thought about dificulties, that C find when it need convert numbers from one type to another. I thing that one of the bigest dificulty of PostgreSQL is how it treats numbers. SQL standard says that a number is comparable to, and compatible with, all other numbers (that is: all numbers are mutually comparable and mutually assignable). I think it means directly, without force it with the CAST or functions. I can't imagine end users using things like this: UPDATE my_table SET my_small=int2(my_int4); SELECT SUM(float8(my_int) FROM my_table; SELECT mysmall::int4 * mysmall::int4 FROM my_table; End users hate uneccessaries or non sense messages like the following when he/she writes a sentence like this: UPDATE my_table SET my_int=my_small; and PostgreSQL show him/her this message ? --------------------------------------------------------------------- WARN: type of my_small does not match target column my_int --------------------------------------------------------------------- - What message is this ? If PostgreSQL understand that I want to do: UPDATE my_table SET my_int=my_small::int4; why it doesn't do that for me? SELECT my_small * my_float FROM my_table; --------------------------------------------------------------------- NOTICE:there is no operator * for types int2 and float8 NOTICE:You will either have to retype this query using an NOTICE:explicit cast, or you will have to define the operator WARN:* for int2 and float8 using CREATE OPERATOR --------------------------------------------------------------------- - Well, if PostgreSQL know what I want why it doesn't do it for me ? I know that conversion is more expensive than warnning but it's more friendly. What do you think? ;-) Regards, Jose'
> I thing that one of the bigest dificulty of PostgreSQL is how it treats numbers. > SQL standard says that a number is comparable to, and compatible with, all > other numbers (that is: all numbers are mutually comparable and mutually > assignable). I think it means directly, without force it with the CAST or > functions. I agree that the numeric (and other) type coersions need some more work. One of the reasons why we have been reluctant to embed _too_ many automatic coersions into the backend is that it does not extend well into the type-extensible features of Postgres. But I hope to look at this some for v6.4. > and PostgreSQL show him/her this message ? > WARN: type of my_small does not match target column my_int this is an example of behavior which I think we _can_ change... > - Well, if PostgreSQL know what I want why it doesn't do it for me ? > I know that conversion is more expensive than warnning but it's more > friendly. What do you think? ;-) I agree up to a point. Once a user has chosen to use an "optimized data type" like "int2" rather than "integer" or "int4" then they have also chosen to give up some capabilities of the generic integer type such as dynamic range. However, some extra conversion stuff can and should happen, and I have some ideas on how to do this. If any hacker wants to work on it, let's start talking after the upcoming v6.3 release... - Tom