Thread: Function trunc() behaves in unexpected manner with different data types
Function trunc() behaves in unexpected manner with different data types
From
"Nathan M. Davalos"
Date:
I ran into something interesting with using trunc() and different data types: The following is a simplified from the statement we're using and produces the same results: =20 select trunc( ((cast(2183.68 as numeric) - cast(1 as numeric)) )*100) /100=20 yields 2184.68 =20 select trunc(((cast(2183.68 as numeric) - cast(1 as double precision) ))*100) /100 yields 2184.67 =20 select trunc(cast(2184.68 as double precision) *100)=20 yields 218467 instead of 218468 =20 This only happens on certain ranges of numbers. Doing the same thing with the number 3183.68 yields the same result in both cases. It only appears to happen when a number is declared as a double and there is no number past the last significant digit or the number past the last significant digit is a zero AND falls within a certain range of numbers. For instance select trunc(cast(2184.681 as double precision) *100) yields 218468, but select trunc(cast(2184.680 as double precision) *100) yields 218467. =20 I already made sure everything we're using is just defined as numeric to avoid the issue.
Re: Function trunc() behaves in unexpected manner with different data types
From
Merlin Moncure
Date:
On Thu, Feb 24, 2011 at 1:01 PM, Nathan M. Davalos <n.davalos@sharedmarketing.com> wrote: > I ran into something interesting with using trunc() and different data > types: > > The following is a simplified from the statement we=92re using and produc= es > the same results: > > select trunc( ((cast(2183.68 as numeric) - cast(1 as numeric)) )*100) /100 > > =A0=A0yields 2184.68 the root issue I think here is that the string version of the double precision math is approximated: postgres=3D# create table v as select floor(2183.68::float8 * 100) as v; postgres=3D# select * from v; v -------- 218367 (1 row) postgres=3D# select floor(v) from v; floor -------- 218367 postgres=3D# insert into v select 218368; INSERT 0 1 (1 row) postgres=3D# select distinct v from v; v -------- 218368 218368 (2 rows) As you can see, even though the string versions are the same, the internal representation is different. You could dump the data and restore it and get different results. Also the text/binary protocols would send different data to the client. I don't know if this is a bug in postgresql floating point implementation or not: i think the backend would either have to print 218367.999999999999999ish number or spend the time to look for these cases and round them internally. Floating point is a headache :-). merlin
On Thu, Feb 24, 2011 at 7:31 PM, Merlin Moncure <mmoncure@gmail.com> wrote: > the root issue I think here is that the string version of the double > precision math is approximated: No, it's simpler than that, all double precision math is approximated. The root issue is that 2183.67 is not representable in a floating point binary number. Just like 1/3 isn't representable in base 10 (decimal) numbers many fractions aren't representable in base 2 (binary) numbers. The result are repeated decimals like 0.3333... if you multiply that by three you get 0.99999 and if you truncate that you get 0 insted of 1. It's the trunc() that's exposing the imprecision because like "=" it depends on the precise value of the number down to the last digit. Though depending on the arithmetic you can always make the precision expand beyond the last digit anyways -- when you multiply by 100 you magnify that imprecision too. -- greg
Re: Function trunc() behaves in unexpected manner with different data types
From
Merlin Moncure
Date:
On Thu, Feb 24, 2011 at 8:03 PM, Greg Stark <gsstark@mit.edu> wrote: > On Thu, Feb 24, 2011 at 7:31 PM, Merlin Moncure <mmoncure@gmail.com> wrote: >> the root issue I think here is that the string version of the double >> precision math is approximated: > > No, it's simpler than that, all double precision math is approximated. > The root issue is that 2183.67 is not representable in a floating > point binary number. Just like 1/3 isn't representable in base 10 > (decimal) numbers many fractions aren't representable in base 2 > (binary) numbers. The result are repeated decimals like 0.3333... if you > multiply that by three you get 0.99999 and if you truncate that you > get 0 insted of 1. > > It's the trunc() that's exposing the imprecision because like "=" it > depends on the precise value of the number down to the last digit. > Though depending on the arithmetic you can always make the precision > expand beyond the last digit anyways -- when you multiply by 100 you > magnify that imprecision too. right -- in understand how floating point works -- but are you are saying that you are ok with the fact that (for example) a table with a floating point unique key could dump and not restore? more specifically, a binary dump would restore but a text dump would not. I think this is a problem with our implementation -- not all versions of 2183.68 as outputted from the server are the same internally. put another way, text output from the server should unambiguously match what sourced the text. in the case of floating point, it does not...there are N versions of internal data that can match particular text output. I am speculating that the rounding is happening in the wrong place maybe. merlin
Merlin Moncure <mmoncure@gmail.com> writes: > right -- in understand how floating point works -- but are you are > saying that you are ok with the fact that (for example) a table with a > floating point unique key could dump and not restore? more > specifically, a binary dump would restore but a text dump would not. pg_dump takes measures against that (see extra_float_digits). > I think this is a problem with our implementation -- not all versions > of 2183.68 as outputted from the server are the same internally. It's an inherent property of float math. Yes, we could set the default value of extra_float_digits high enough that distinct internal values always had distinct text representations, but trust me, you would not like it. regards, tom lane
Re: Function trunc() behaves in unexpected manner with different data types
From
Merlin Moncure
Date:
On Fri, Feb 25, 2011 at 9:21 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Merlin Moncure <mmoncure@gmail.com> writes: >> right -- in understand how floating point works -- but are you are >> saying that you are ok with the fact that (for example) a table with a >> floating point unique key could dump and not restore? more >> specifically, a binary dump would restore but a text dump would not. > > pg_dump takes measures against that (see extra_float_digits). > >> I think this is a problem with our implementation -- not all versions >> of 2183.68 as outputted from the server are the same internally. > > It's an inherent property of float math. =A0Yes, we could set the default > value of extra_float_digits high enough that distinct internal values > always had distinct text representations, but trust me, you would not > like it. no I wouldn't, and the pg_dump extra_float_digits setting addresses my primary concern. The client has a similar issue though -- suppose it fetches a value from the server and updates it back -- which record gets the update? You would get different results if the client was using binary or text features of the protocol. Not saying this is wrong or needs to be fixed, just pointing it out :-). update foo set val=3Dval + 1 where val =3D 2183.68; merlin
Re: Function trunc() behaves in unexpected manner with different data types
From
Alvaro Herrera
Date:
Excerpts from Merlin Moncure's message of vie feb 25 12:28:25 -0300 2011: > no I wouldn't, and the pg_dump extra_float_digits setting addresses my > primary concern. The client has a similar issue though -- suppose it > fetches a value from the server and updates it back -- which record > gets the update? You would get different results if the client was > using binary or text features of the protocol. Not saying this is > wrong or needs to be fixed, just pointing it out :-). > > update foo set val=val + 1 where val = 2183.68; I think the mere idea of using floating point equality on a WHERE clause is bogus, regardless of text or binary format. -- Ãlvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: Function trunc() behaves in unexpected manner with different data types
From
Merlin Moncure
Date:
On Fri, Feb 25, 2011 at 9:40 AM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Excerpts from Merlin Moncure's message of vie feb 25 12:28:25 -0300 2011: > >> no I wouldn't, and the pg_dump extra_float_digits setting addresses my >> primary concern. =A0The client has a similar issue though -- suppose it >> fetches a value from the server and updates it back -- which record >> gets the update? =A0You would get different results if the client was >> using binary or text features of the protocol. =A0Not saying this is >> wrong or needs to be fixed, just pointing it out :-). >> >> update foo set val=3Dval + 1 where val =3D 2183.68; > > I think the mere idea of using floating point equality on a > WHERE clause is bogus, regardless of text or binary format. That's a bridge to far -- akin to saying floating point should not support equality operator. select count(*) from foo where val >=3D 2183.68? you are ok getting different answers depending on method of transmission of 2183.68 to the server? merlin
Re: Function trunc() behaves in unexpected manner with different data types
From
Merlin Moncure
Date:
On Fri, Feb 25, 2011 at 9:48 AM, Merlin Moncure <mmoncure@gmail.com> wrote: > On Fri, Feb 25, 2011 at 9:40 AM, Alvaro Herrera > <alvherre@commandprompt.com> wrote: >> Excerpts from Merlin Moncure's message of vie feb 25 12:28:25 -0300 2011: >> >>> no I wouldn't, and the pg_dump extra_float_digits setting addresses my >>> primary concern. =A0The client has a similar issue though -- suppose it >>> fetches a value from the server and updates it back -- which record >>> gets the update? =A0You would get different results if the client was >>> using binary or text features of the protocol. =A0Not saying this is >>> wrong or needs to be fixed, just pointing it out :-). >>> >>> update foo set val=3Dval + 1 where val =3D 2183.68; >> >> I think the mere idea of using floating point equality on a >> WHERE clause is bogus, regardless of text or binary format. > > That's a bridge to[sic] far -- akin to saying floating point should not > support equality operator. =A0select count(*) from foo where val >=3D > 2183.68? =A0you are ok getting different answers depending on method of > transmission of 2183.68 to the server? I stand corrected -- I did some digging and Postgres's handling of this issue is afaict correct: you are supposed to round on presentation only, and equality matching on floating point in sql (just like in C) is capricious exercise at best, at least without some defenses. So, we can definitely file under 'not a bug'. merlin