Thread: Precision loss casting float to numeric

Precision loss casting float to numeric

From
Chapman Flack
Date:
Back in

https://www.postgresql.org/message-id/4e384467-f28a-69ce-
75aa-4bc01125a39d%40anastigmatix.net

I got intrigued about casting float values to numeric. Two queries below
(one for float4, one for float8) show what happens for float values with
bits of precision from one up to the limit of the data type. Each value
generated has two 1 bits in the mantissa, one always at 2^0 and the other
at 2^(-p) for increasing p; in other words, each value is the sum of 1 and
2^(-p), and the last such value that compares unequal to 1 is the value
1 plus (FLT_EPSILON or DBL_EPSILON, respectively, for the machine). The
next value generated that way would be indistinguishable from 1.

I'm testing on Intel, IEEE-754 hardware, with 24 bits of precision for
float4 and 53 bits for float8.

TL;DR: casting these values to numeric loses their distinguishability
from 1, six bits early for float4 and five bits early for float8. Casting
to numeric and back again is even worse: you get only six of your 24 bits
back for float4, only 15 of 53 for float8.

Everyone expects the floats to be approximate types and that they
won't be able to exactly represent arbitrary values. However, that's
not how numeric is advertised, and the idea that a numeric can't safely
keep the precision of a float4 or float8, and give it back when you ask,
at best violates least astonishment and arguably is a bug.

I see where at least the first problem (5 or 6 bits lost casting to
numeric) comes in:


https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/utils/adt/numeric.c;h=6f400729713bfd942cc196b81d50bf25e4757315;hb=c4ba1bee68abe217e441fb81343e5f9e9e2a5353#l3298


https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/utils/adt/numeric.c;h=6f400729713bfd942cc196b81d50bf25e4757315;hb=c4ba1bee68abe217e441fb81343e5f9e9e2a5353#l3227

These conversions work by sprintf of the float value to its text
representation with FLT_DIG or DBL_DIG decimal digits of precision.

The mistake is subtle and understandable, because even the C standards
didn't catch it until recently, when they introduced the new constants
FLT_DECIMAL_DIG and DBL_DECIMAL_DIG, because they realized the constant
you need depends on what you're doing.

The old FLT_DIG and DBL_DIG tell you how many digits of a decimal
number you can count on recovering intact after conversion to float
or double and back to decimal. That's inside-out from what's needed
here, where the constant of interest is the number of decimal digits
necessary to reliably represent any float or double so you can get
those original bits back, and that's what the FLT_DECIMAL_DIG and
DBL_DECIMAL_DIG constants are, and they're larger (by 3 and 2 decimal
digits, respectively, on IEEE-754 hardware) than their FLT_DIG and
DBL_DIG counterparts.

So, a trivial fix for float4_numeric and float8_numeric would be to
change the constant used in the sprintf, and that would (I think) solve
at least the first precision-loss problem. But it would only compile
where the compiler is new enough to define the newer constants, and
my preferred fix is to just open-code the conversion using frexp
rather than going through the text representation at all. I'm working
on that patch.

The second problem (losing even more bits in the roundtrip to numeric
and back) suggests that in the other direction, numeric_float4 and
numeric_float8 need some love too, but right now I'm focused on the
to-numeric direction).

-Chap


Columns:
 place     - the place value of the 1 bit on the right; in the last row,
             this is the machine epsilon for the type.

 float4gt  - whether 1+place is distinguishable from 1 using
 float8gt    all float4/float8 operations; true for every row.

 numericgt - whether 1+place is still distinguishable from 1 after
             casting to numeric.

 rtgt      - whether the roundtrip, 1+place cast to numeric and back,
             is still distinguishable from 1. Ends up same as numericgt
             (on my platform anyway).

 rteq      - whether the roundtrip, 1+place cast to numeric and back,
             equals the original 1+place. Starts failing quite early.

WITH RECURSIVE
  f4(place) AS (
    VALUES (1::float4)
    UNION
    SELECT place/2::float4
    FROM f4
    WHERE 1::float4 + place/2::float4 > 1::float4
  )
SELECT
  place,
  1::float4 + place > 1::float4 AS float4gt,
  (1::float4 + place)::numeric > 1::numeric AS numericgt,
  (1::float4 + place)::numeric::float4 > 1::float4 AS rtgt,
  (1::float4 + place)::numeric::float4 = 1::float4 + place as rteq
FROM
  f4;

    place    | float4gt | numericgt | rtgt | rteq
-------------+----------+-----------+------+------
           1 | t        | t         | t    | t
         0.5 | t        | t         | t    | t
        0.25 | t        | t         | t    | t
       0.125 | t        | t         | t    | t
      0.0625 | t        | t         | t    | t
     0.03125 | t        | t         | t    | t
    0.015625 | t        | t         | t    | f
   0.0078125 | t        | t         | t    | f
  0.00390625 | t        | t         | t    | f
  0.00195312 | t        | t         | t    | f
 0.000976562 | t        | t         | t    | f
 0.000488281 | t        | t         | t    | f
 0.000244141 | t        | t         | t    | f
  0.00012207 | t        | t         | t    | f
 6.10352e-05 | t        | t         | t    | f
 3.05176e-05 | t        | t         | t    | f
 1.52588e-05 | t        | t         | t    | f
 7.62939e-06 | t        | t         | t    | f
  3.8147e-06 | t        | f         | f    | f
 1.90735e-06 | t        | f         | f    | f
 9.53674e-07 | t        | f         | f    | f
 4.76837e-07 | t        | f         | f    | f
 2.38419e-07 | t        | f         | f    | f
 1.19209e-07 | t        | f         | f    | f
(24 rows)

WITH RECURSIVE
  f8(place) AS (
    VALUES (1::float8)
    UNION
    SELECT place/2::float8
    FROM f8
    WHERE 1::float8 + place/2::float8 > 1::float8
  )
SELECT
  place,
  1::float8 + place > 1::float8 AS float8gt,
  (1::float8 + place)::numeric > 1::numeric AS numericgt,
  (1::float8 + place)::numeric::float8 > 1::float8 AS rtgt,
  (1::float8 + place)::numeric::float8 = 1::float8 + place as rteq
FROM
  f8;

        place         | float8gt | numericgt | rtgt | rteq
----------------------+----------+-----------+------+------
                    1 | t        | t         | t    | t
                  0.5 | t        | t         | t    | t
                 0.25 | t        | t         | t    | t
                0.125 | t        | t         | t    | t
               0.0625 | t        | t         | t    | t
              0.03125 | t        | t         | t    | t
             0.015625 | t        | t         | t    | t
            0.0078125 | t        | t         | t    | t
           0.00390625 | t        | t         | t    | t
          0.001953125 | t        | t         | t    | t
         0.0009765625 | t        | t         | t    | t
        0.00048828125 | t        | t         | t    | t
       0.000244140625 | t        | t         | t    | t
      0.0001220703125 | t        | t         | t    | t
      6.103515625e-05 | t        | t         | t    | t
     3.0517578125e-05 | t        | t         | t    | f
    1.52587890625e-05 | t        | t         | t    | f
    7.62939453125e-06 | t        | t         | t    | f
   3.814697265625e-06 | t        | t         | t    | f
  1.9073486328125e-06 | t        | t         | t    | f
  9.5367431640625e-07 | t        | t         | t    | f
 4.76837158203125e-07 | t        | t         | t    | f
 2.38418579101562e-07 | t        | t         | t    | f
 1.19209289550781e-07 | t        | t         | t    | f
 5.96046447753906e-08 | t        | t         | t    | f
 2.98023223876953e-08 | t        | t         | t    | f
 1.49011611938477e-08 | t        | t         | t    | f
 7.45058059692383e-09 | t        | t         | t    | f
 3.72529029846191e-09 | t        | t         | t    | f
 1.86264514923096e-09 | t        | t         | t    | f
 9.31322574615479e-10 | t        | t         | t    | f
 4.65661287307739e-10 | t        | t         | t    | f
  2.3283064365387e-10 | t        | t         | t    | f
 1.16415321826935e-10 | t        | t         | t    | f
 5.82076609134674e-11 | t        | t         | t    | f
 2.91038304567337e-11 | t        | t         | t    | f
 1.45519152283669e-11 | t        | t         | t    | f
 7.27595761418343e-12 | t        | t         | t    | f
 3.63797880709171e-12 | t        | t         | t    | f
 1.81898940354586e-12 | t        | t         | t    | f
 9.09494701772928e-13 | t        | t         | t    | f
 4.54747350886464e-13 | t        | t         | t    | f
 2.27373675443232e-13 | t        | t         | t    | f
 1.13686837721616e-13 | t        | t         | t    | f
  5.6843418860808e-14 | t        | t         | t    | f
  2.8421709430404e-14 | t        | t         | t    | f
  1.4210854715202e-14 | t        | t         | t    | f
   7.105427357601e-15 | t        | t         | t    | f
  3.5527136788005e-15 | t        | f         | f    | f
 1.77635683940025e-15 | t        | f         | f    | f
 8.88178419700125e-16 | t        | f         | f    | f
 4.44089209850063e-16 | t        | f         | f    | f
 2.22044604925031e-16 | t        | f         | f    | f
(53 rows)


Re: Precision loss casting float to numeric

From
Chapman Flack
Date:
Here are two patches.

The 0001-*.patch simply adds a regression test to numeric.sql that bits
aren't lost casting from float[48] to numeric. Naturally, it fails at this
stage.

The 0002-*.patch is a proof-of-concept patching float4_numeric and
float8_numeric in the trivial way (just using FLT_DECIMAL_DIG and
DBL_DECIMAL_DIG in place of FLT_DIG and DBL_DIG). It makes the new
regression test pass. (It will only work under a compiler that has
__FLT_DECIMAL_DIG__ and __DBL_DECIMAL_DIG__ available, and I used
those internal versions to avoid mucking with build tooling to change
the target C standard, which I assume wouldn't be welcome anyway.
This patch is just POC and not proposed as anything else.)

It changes the output of an existing numeric test and a few numeric
aggregate tests.

I have adjusted the numeric test: its purpose is to check the direction
of numeric rounding of ties, but the value to be rounded was computed
in float8 and then cast to numeric, and because negative powers of ten
aren't tidy in binary, it can turn out that the float8 computation
produces a correctly-rounded-53-bit-result that is on the nearer-to-zero
side of a tie, and now that that result is correctly cast, the resulting
numeric doesn't round in the away-from-zero direction.

I changed that test because I concluded it wasn't meant to test
float8-to-numeric casting, but only the rounding of tied numerics,
so I just made the inner expression be typed numeric, and the expected
output is unchanged.

The three aggregate tests with changed output are working from a table
of float4 values, and my assumption is they are now simply producing
the correct aggregate results given the full precision of the input
values, but I haven't confirmed that yet, so this patch leaves those
three failing for now.

-Chap

Attachment

Re: Precision loss casting float to numeric

From
Tom Lane
Date:
Chapman Flack <chap@anastigmatix.net> writes:
> The 0002-*.patch is a proof-of-concept patching float4_numeric and
> float8_numeric in the trivial way (just using FLT_DECIMAL_DIG and
> DBL_DECIMAL_DIG in place of FLT_DIG and DBL_DIG). It makes the new
> regression test pass. (It will only work under a compiler that has
> __FLT_DECIMAL_DIG__ and __DBL_DECIMAL_DIG__ available, and I used
> those internal versions to avoid mucking with build tooling to change
> the target C standard, which I assume wouldn't be welcome anyway.

Nope.  TBH, I'd think about just using "DBL_DIG + 3", given our existing
coding around extra_float_digits in places like pg_dump and postgres_fdw.
The knowledge that you need 2 or 3 extra digits is already well embedded.

Conceivably you could do it like

#ifndef DBL_DECIMAL_DIG
#ifdef __DBL_DECIMAL_DIG__
#define DBL_DECIMAL_DIG __DBL_DECIMAL_DIG__
#else
#define DBL_DECIMAL_DIG (DBL_DIG + 3)
#endif
#endif

but I'm not exactly seeing how that buys us anything.

The bigger question here is whether people actually want this behavioral
change.  I think there's probably a bigger chance of complaints that
"casting 1.1::float8 to numeric now produces some weird,
incorrectly-rounded result" than that we make anyone happier.

I have a vague idea that at some point in the past we discussed making
this conversion use extra_float_digits, which'd allow satisfying both
camps, at the nontrivial price that the conversion would have to be
considered stable not immutable.  We didn't pull the trigger, if this
memory is real at all, presumably because of the mutability issue.

Another idea would be to leave the cast alone and introduce a named
function that does the "exact" conversion.  Possibly that makes nobody
happy, but at least both the cast and the function could be immutable.
It'd dodge backwards-compatibility objections, too.

            regards, tom lane


Re: Precision loss casting float to numeric

From
Robert Haas
Date:
On Mon, Feb 26, 2018 at 1:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The bigger question here is whether people actually want this behavioral
> change.  I think there's probably a bigger chance of complaints that
> "casting 1.1::float8 to numeric now produces some weird,
> incorrectly-rounded result" than that we make anyone happier.

Yeah, I worry about that, too.

Of course, as you know, I also have a deep and abiding skepticism
about IEEE binary floats as a concept.  Anomalies are unavoidable; we
can choose exactly which set users experience, but we can eliminate
them because the underlying storage format is poorly-adapted to the
behavior people actually want.  It's too bad that IEEE decimal floats
weren't standardized until 2008.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Precision loss casting float to numeric

From
Chapman Flack
Date:
On 02/26/2018 01:29 PM, Tom Lane wrote:

> I think there's probably a bigger chance of complaints that
> "casting 1.1::float8 to numeric now produces some weird,
> incorrectly-rounded result" than that we make anyone happier.

Arguably, though, that's a moment that can be used to explain
exactly what the correctly-rounded value of 1.1::float8 is and
why, and then people both know something new and understand more
precisely what's happening to their data, and can apply round()
to it in exactly the places they want, if they want.

In contrast, the current fact that 1.1::float8 looks like 1.1
when cast to numeric puts a superficial smile on people's faces,
while they haven't really been asked how they feel about losing
five, six, or thirty-eight bits of precision when casting one data
type into another of supposedly greater precision and back. I think
the typical assumption is that, sure, you may lose precision
if you cast to a *less* precise type, but the other direction's
assumed value-preserving.

I can see the concern about changing behavior for code that may
exist already. I would never have thought of making the behavior
of a cast sensitive to extra_float_digits (in fact, I hadn't
known about extra_float_digits; it's described in the "locale
and formatting" section, which I never dreamed of consulting
for a cast between internal value representations; am I weird
in that?).

I wonder if an alternative to making a cast that can't be immutable,
because it looks at a GUC, could be to offer a choice of cast
functions: if you need the other behavior, create a schema, do a
CREATE CAST in it, and put it on your search path ahead of pg_catalog.
Kind of ugly, but that can happen dealing with kind of ugly situations.

-Chap


Re: Precision loss casting float to numeric

From
Tom Lane
Date:
Chapman Flack <chap@anastigmatix.net> writes:
> I wonder if an alternative to making a cast that can't be immutable,
> because it looks at a GUC, could be to offer a choice of cast
> functions: if you need the other behavior, create a schema, do a
> CREATE CAST in it, and put it on your search path ahead of pg_catalog.

Nope, because casts aren't schema-local, since they don't have names.
There can be only one cast between given source and target types.

            regards, tom lane


Re: Precision loss casting float to numeric

From
Bear Giles
Date:


On Mon, Feb 26, 2018 at 11:29 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Chapman Flack <chap@anastigmatix.net> writes:
> The 0002-*.patch is a proof-of-concept patching float4_numeric and
> float8_numeric in the trivial way (just using FLT_DECIMAL_DIG and
> DBL_DECIMAL_DIG in place of FLT_DIG and DBL_DIG). It makes the new
> regression test pass. (It will only work under a compiler that has
> __FLT_DECIMAL_DIG__ and __DBL_DECIMAL_DIG__ available, and I used
> those internal versions to avoid mucking with build tooling to change
> the target C standard, which I assume wouldn't be welcome anyway.

Nope.  TBH, I'd think about just using "DBL_DIG + 3", given our existing
coding around extra_float_digits in places like pg_dump and postgres_fdw.
The knowledge that you need 2 or 3 extra digits is already well embedded.

Conceivably you could do it like

#ifndef DBL_DECIMAL_DIG
#ifdef __DBL_DECIMAL_DIG__
#define DBL_DECIMAL_DIG __DBL_DECIMAL_DIG__
#else
#define DBL_DECIMAL_DIG (DBL_DIG + 3)
#endif
#endif

but I'm not exactly seeing how that buys us anything.

The bigger question here is whether people actually want this behavioral
change.  I think there's probably a bigger chance of complaints that
"casting 1.1::float8 to numeric now produces some weird,
incorrectly-rounded result" than that we make anyone happier.

I have a vague idea that at some point in the past we discussed making
this conversion use extra_float_digits, which'd allow satisfying both
camps, at the nontrivial price that the conversion would have to be
considered stable not immutable.  We didn't pull the trigger, if this
memory is real at all, presumably because of the mutability issue.

Another idea would be to leave the cast alone and introduce a named
function that does the "exact" conversion.  Possibly that makes nobody
happy, but at least both the cast and the function could be immutable.
It'd dodge backwards-compatibility objections, too.

                        regards, tom lane
 
​Working for a company that ​
has enterprise customers this can't be overemphasized.
Never require the user to do something so they keep getting the same results.​
​ It doesn't
matter if it's "wrong".

​I would vote for a property. If you want the best effort to match the IEEE spec
you need to execute 'set use_ieee_numbers'  and you'll get the extra digits and
rounding behavior. If not ​you'll get the existing behavior.

Bear

Re: Precision loss casting float to numeric

From
Emre Hasegeli
Date:
>> I wonder if an alternative to making a cast that can't be immutable,
>> because it looks at a GUC, could be to offer a choice of cast
>> functions: if you need the other behavior, create a schema, do a
>> CREATE CAST in it, and put it on your search path ahead of pg_catalog.
>
> Nope, because casts aren't schema-local, since they don't have names.
> There can be only one cast between given source and target types.

In this case, I cannot see any other option than adding those as
separate cast functions.  Should we mark this entry as "returned with
feedback"?

We can also consider turning the current float to numeric casts to
explicit as they are causing data loss.  I am not sure how much it
would impact backwards-compatibility.  The counter argument is the
numeric to float casts being IMPLICIT.  They are causing data loss for
sure, but I believe there are different reasons to keep them as
IMPLICIT.


Re: Precision loss casting float to numeric

From
Chapman Flack
Date:
On 03/09/18 12:05, Emre Hasegeli wrote:
> In this case, I cannot see any other option than adding those as
> separate cast functions.  Should we mark this entry as "returned with
> feedback"?
> 
> We can also consider turning the current float to numeric casts to
> explicit as they are causing data loss.  I am not sure how much it
> would impact backwards-compatibility.  The counter argument is the
> numeric to float casts being IMPLICIT.  They are causing data loss for
> sure, but I believe there are different reasons to keep them as
> IMPLICIT.

Thanks for the feedback. I will mark it RWF myself, as the backward-
compatibility issues are kind of paralyzing, and I don't think I'll
have time in this CF to give it enough further thought anyway.

I wonder whether even changing a formerly-implicit cast to explicit
would be too much of a behavior change for existing code that expects
the current behavior?

-Chap


Re: Precision loss casting float to numeric

From
Tom Lane
Date:
Chapman Flack <chap@anastigmatix.net> writes:
> I wonder whether even changing a formerly-implicit cast to explicit
> would be too much of a behavior change for existing code that expects
> the current behavior?

We did exactly that during the 8.3 cycle, and got one heck of a lot of
pushback about it, despite the existence of a lot of examples showing
that it would be a good idea.  I won't say we can't do it again, but
it won't be an easy sell.

            regards, tom lane


Re: Precision loss casting float to numeric

From
Tom Lane
Date:
Emre Hasegeli <emre@hasegeli.com> writes:
> We can also consider turning the current float to numeric casts to
> explicit as they are causing data loss.  I am not sure how much it
> would impact backwards-compatibility.  The counter argument is the
> numeric to float casts being IMPLICIT.  They are causing data loss for
> sure, but I believe there are different reasons to keep them as
> IMPLICIT.

FWIW, the behavior of these casts is intended to model what the SQL
standard says about the behavior of "exact numeric" vs "approximate
numeric" types.  (In our implementation, the integral types plus numeric
are in the first category, float4 and float8 in the second.)  The spec is
perfectly clear that you can assign an approximate numeric to an exact
numeric, or vice versa, without any explicit cast.  It is also clear that
arithmetic operations combining approximate and exact are allowed without
a cast, and yield approximate results.  These rules lead to our
conclusions that exact -> approximate is an implicit cast (so that the
parser will choose eg. float8 multiplication over numeric multiplication
if you write numericvar * float8var) while approximate -> exact is an
assignment cast (so that you can assign float8 to numeric without explicit
casting).  If the decisions had been driven purely by "what risks silent
precision loss", no doubt we'd have done it differently ... but it's hard
to do that and still meet the letter of the spec.

            regards, tom lane