Re: Strange results when casting string to double - Mailing list pgsql-general

From Carsten Klein
Subject Re: Strange results when casting string to double
Date
Msg-id 4b255a51-d9cd-8957-0d56-3b022ec0f3ad@datagis.com
In response to Re: Strange results when casting string to double  (Gavan Schneider <list.pg.gavan@pendari.org>)
List pgsql-general


On Thu, Feb 17, 2022 at 00:07 Gavan Schneider wrote:

> Harking back to my long distant formative years I was taught to never ever rely on equality tests when it came to
> floating point values. Nothing has changed in this regard.
>
> If binary exact is part of the requirement then the answer is INTEGER or NUMERIC, never FLOAT, REAL, DOUBLE or
> anything similar.
>
> So, assuming the trigger function is the only source of this spurious grief, one could apply the lesson taught in the
> 60s from the dawn of computing:
>
>     EQUALITY = absolute_value(op1 - op2) < epsilon — where op1 & op2 are DOUBLE, and epsilon is smaller than you
>     care for
>
> Given the intrinsic (standards compliant) uncertainty when converting from absolute (e.g., string representation) to
> floating point there will never be value in resolving why there are differences.
>
> I suggest using the comparison that is appropriate to the representation of those values, or fix the design by using
> the proper representation.

Just some(!) background:

I know that there are other ways to compare _any_ floating point values.
However, doing that for a whole ROW in a fast manner is not trivial
(e.g. it tends to get quite slow). With the hstore extension and (so
called) binary equality I've found a very fast way which is also fully
generic. hstore uses the text representation, so comparisons depend on
how values are converted to and from text.


But all that is not the point. Double precision conversion algorithms
are well defined and deterministic, so they should yield the same
results when called with the same arguments (every time and on every
machine). That is:

1.56::double precision == 1.56000000000000005329070518201E0

1.56::double precision != 1.55999999999999983124610025698E0

With reduced precision (as in PG), that gives you:

1.56::double precision == 1.56

1.56::double precision != 1.55999999999999

However, one of my PostgreSQL servers returns the latter (wrong) value.
You can verify the correct conversion with the C library function
'strtod' or with any of the online converters around:

https://www.binaryconvert.com/result_double.html

Click 'New Conversion' and enter 1.56 into the 'Decimal' field. Then
click 'Convert to binary' or hit Enter.

So, the primary problem of that PostgreSQL server is that it converts
text to double in the wrong way. Apart from any triggers, "binary
equality" and whatever else I'm doing in this project, this has dramatic
effects on the database, as it messes up the values that I'm storing:

Imagine I do:

INSERT INTO foo (my_col) VALUES ('Hello World');

But the database has happily stored a different string:

SELECT my_col FROM foo;
     my_col
-------------
  Hello Worlc
(1 row)

Finding that string again may be done with "fuzzy search" or regular
expressions, but the much better approach is for the database not to
mess up the string while storing it.

Double precision values are limited in precision (in binary) and there
are numbers that cannot be stored exactly (just as is true for many
decimal numbers, like 1/3 ≈ 0.33333). Nevertheless, within the given
maximum precision, the same value should always have the same (well
defined) binary representation, so conversions between text and double
should never change the value, at any time or on any machine.

Carsten


