Thread: Precision loss casting float to numeric
Back in https://www.postgresql.org/message-id/4e384467-f28a-69ce- 75aa-4bc01125a39d%40anastigmatix.net I got intrigued about casting float values to numeric. Two queries below (one for float4, one for float8) show what happens for float values with bits of precision from one up to the limit of the data type. Each value generated has two 1 bits in the mantissa, one always at 2^0 and the other at 2^(-p) for increasing p; in other words, each value is the sum of 1 and 2^(-p), and the last such value that compares unequal to 1 is the value 1 plus (FLT_EPSILON or DBL_EPSILON, respectively, for the machine). The next value generated that way would be indistinguishable from 1. I'm testing on Intel, IEEE-754 hardware, with 24 bits of precision for float4 and 53 bits for float8. TL;DR: casting these values to numeric loses their distinguishability from 1, six bits early for float4 and five bits early for float8. Casting to numeric and back again is even worse: you get only six of your 24 bits back for float4, only 15 of 53 for float8. Everyone expects the floats to be approximate types and that they won't be able to exactly represent arbitrary values. However, that's not how numeric is advertised, and the idea that a numeric can't safely keep the precision of a float4 or float8, and give it back when you ask, at best violates least astonishment and arguably is a bug. I see where at least the first problem (5 or 6 bits lost casting to numeric) comes in: https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/utils/adt/numeric.c;h=6f400729713bfd942cc196b81d50bf25e4757315;hb=c4ba1bee68abe217e441fb81343e5f9e9e2a5353#l3298 https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/utils/adt/numeric.c;h=6f400729713bfd942cc196b81d50bf25e4757315;hb=c4ba1bee68abe217e441fb81343e5f9e9e2a5353#l3227 These conversions work by sprintf of the float value to its text representation with FLT_DIG or DBL_DIG decimal digits of precision. The mistake is subtle and understandable, because even the C standards didn't catch it until recently, when they introduced the new constants FLT_DECIMAL_DIG and DBL_DECIMAL_DIG, because they realized the constant you need depends on what you're doing. The old FLT_DIG and DBL_DIG tell you how many digits of a decimal number you can count on recovering intact after conversion to float or double and back to decimal. That's inside-out from what's needed here, where the constant of interest is the number of decimal digits necessary to reliably represent any float or double so you can get those original bits back, and that's what the FLT_DECIMAL_DIG and DBL_DECIMAL_DIG constants are, and they're larger (by 3 and 2 decimal digits, respectively, on IEEE-754 hardware) than their FLT_DIG and DBL_DIG counterparts. So, a trivial fix for float4_numeric and float8_numeric would be to change the constant used in the sprintf, and that would (I think) solve at least the first precision-loss problem. But it would only compile where the compiler is new enough to define the newer constants, and my preferred fix is to just open-code the conversion using frexp rather than going through the text representation at all. I'm working on that patch. The second problem (losing even more bits in the roundtrip to numeric and back) suggests that in the other direction, numeric_float4 and numeric_float8 need some love too, but right now I'm focused on the to-numeric direction). -Chap Columns: place - the place value of the 1 bit on the right; in the last row, this is the machine epsilon for the type. float4gt - whether 1+place is distinguishable from 1 using float8gt all float4/float8 operations; true for every row. numericgt - whether 1+place is still distinguishable from 1 after casting to numeric. rtgt - whether the roundtrip, 1+place cast to numeric and back, is still distinguishable from 1. Ends up same as numericgt (on my platform anyway). rteq - whether the roundtrip, 1+place cast to numeric and back, equals the original 1+place. Starts failing quite early. WITH RECURSIVE f4(place) AS ( VALUES (1::float4) UNION SELECT place/2::float4 FROM f4 WHERE 1::float4 + place/2::float4 > 1::float4 ) SELECT place, 1::float4 + place > 1::float4 AS float4gt, (1::float4 + place)::numeric > 1::numeric AS numericgt, (1::float4 + place)::numeric::float4 > 1::float4 AS rtgt, (1::float4 + place)::numeric::float4 = 1::float4 + place as rteq FROM f4; place | float4gt | numericgt | rtgt | rteq -------------+----------+-----------+------+------ 1 | t | t | t | t 0.5 | t | t | t | t 0.25 | t | t | t | t 0.125 | t | t | t | t 0.0625 | t | t | t | t 0.03125 | t | t | t | t 0.015625 | t | t | t | f 0.0078125 | t | t | t | f 0.00390625 | t | t | t | f 0.00195312 | t | t | t | f 0.000976562 | t | t | t | f 0.000488281 | t | t | t | f 0.000244141 | t | t | t | f 0.00012207 | t | t | t | f 6.10352e-05 | t | t | t | f 3.05176e-05 | t | t | t | f 1.52588e-05 | t | t | t | f 7.62939e-06 | t | t | t | f 3.8147e-06 | t | f | f | f 1.90735e-06 | t | f | f | f 9.53674e-07 | t | f | f | f 4.76837e-07 | t | f | f | f 2.38419e-07 | t | f | f | f 1.19209e-07 | t | f | f | f (24 rows) WITH RECURSIVE f8(place) AS ( VALUES (1::float8) UNION SELECT place/2::float8 FROM f8 WHERE 1::float8 + place/2::float8 > 1::float8 ) SELECT place, 1::float8 + place > 1::float8 AS float8gt, (1::float8 + place)::numeric > 1::numeric AS numericgt, (1::float8 + place)::numeric::float8 > 1::float8 AS rtgt, (1::float8 + place)::numeric::float8 = 1::float8 + place as rteq FROM f8; place | float8gt | numericgt | rtgt | rteq ----------------------+----------+-----------+------+------ 1 | t | t | t | t 0.5 | t | t | t | t 0.25 | t | t | t | t 0.125 | t | t | t | t 0.0625 | t | t | t | t 0.03125 | t | t | t | t 0.015625 | t | t | t | t 0.0078125 | t | t | t | t 0.00390625 | t | t | t | t 0.001953125 | t | t | t | t 0.0009765625 | t | t | t | t 0.00048828125 | t | t | t | t 0.000244140625 | t | t | t | t 0.0001220703125 | t | t | t | t 6.103515625e-05 | t | t | t | t 3.0517578125e-05 | t | t | t | f 1.52587890625e-05 | t | t | t | f 7.62939453125e-06 | t | t | t | f 3.814697265625e-06 | t | t | t | f 1.9073486328125e-06 | t | t | t | f 9.5367431640625e-07 | t | t | t | f 4.76837158203125e-07 | t | t | t | f 2.38418579101562e-07 | t | t | t | f 1.19209289550781e-07 | t | t | t | f 5.96046447753906e-08 | t | t | t | f 2.98023223876953e-08 | t | t | t | f 1.49011611938477e-08 | t | t | t | f 7.45058059692383e-09 | t | t | t | f 3.72529029846191e-09 | t | t | t | f 1.86264514923096e-09 | t | t | t | f 9.31322574615479e-10 | t | t | t | f 4.65661287307739e-10 | t | t | t | f 2.3283064365387e-10 | t | t | t | f 1.16415321826935e-10 | t | t | t | f 5.82076609134674e-11 | t | t | t | f 2.91038304567337e-11 | t | t | t | f 1.45519152283669e-11 | t | t | t | f 7.27595761418343e-12 | t | t | t | f 3.63797880709171e-12 | t | t | t | f 1.81898940354586e-12 | t | t | t | f 9.09494701772928e-13 | t | t | t | f 4.54747350886464e-13 | t | t | t | f 2.27373675443232e-13 | t | t | t | f 1.13686837721616e-13 | t | t | t | f 5.6843418860808e-14 | t | t | t | f 2.8421709430404e-14 | t | t | t | f 1.4210854715202e-14 | t | t | t | f 7.105427357601e-15 | t | t | t | f 3.5527136788005e-15 | t | f | f | f 1.77635683940025e-15 | t | f | f | f 8.88178419700125e-16 | t | f | f | f 4.44089209850063e-16 | t | f | f | f 2.22044604925031e-16 | t | f | f | f (53 rows)
Here are two patches. The 0001-*.patch simply adds a regression test to numeric.sql that bits aren't lost casting from float[48] to numeric. Naturally, it fails at this stage. The 0002-*.patch is a proof-of-concept patching float4_numeric and float8_numeric in the trivial way (just using FLT_DECIMAL_DIG and DBL_DECIMAL_DIG in place of FLT_DIG and DBL_DIG). It makes the new regression test pass. (It will only work under a compiler that has __FLT_DECIMAL_DIG__ and __DBL_DECIMAL_DIG__ available, and I used those internal versions to avoid mucking with build tooling to change the target C standard, which I assume wouldn't be welcome anyway. This patch is just POC and not proposed as anything else.) It changes the output of an existing numeric test and a few numeric aggregate tests. I have adjusted the numeric test: its purpose is to check the direction of numeric rounding of ties, but the value to be rounded was computed in float8 and then cast to numeric, and because negative powers of ten aren't tidy in binary, it can turn out that the float8 computation produces a correctly-rounded-53-bit-result that is on the nearer-to-zero side of a tie, and now that that result is correctly cast, the resulting numeric doesn't round in the away-from-zero direction. I changed that test because I concluded it wasn't meant to test float8-to-numeric casting, but only the rounding of tied numerics, so I just made the inner expression be typed numeric, and the expected output is unchanged. The three aggregate tests with changed output are working from a table of float4 values, and my assumption is they are now simply producing the correct aggregate results given the full precision of the input values, but I haven't confirmed that yet, so this patch leaves those three failing for now. -Chap
Attachment
Chapman Flack <chap@anastigmatix.net> writes: > The 0002-*.patch is a proof-of-concept patching float4_numeric and > float8_numeric in the trivial way (just using FLT_DECIMAL_DIG and > DBL_DECIMAL_DIG in place of FLT_DIG and DBL_DIG). It makes the new > regression test pass. (It will only work under a compiler that has > __FLT_DECIMAL_DIG__ and __DBL_DECIMAL_DIG__ available, and I used > those internal versions to avoid mucking with build tooling to change > the target C standard, which I assume wouldn't be welcome anyway. Nope. TBH, I'd think about just using "DBL_DIG + 3", given our existing coding around extra_float_digits in places like pg_dump and postgres_fdw. The knowledge that you need 2 or 3 extra digits is already well embedded. Conceivably you could do it like #ifndef DBL_DECIMAL_DIG #ifdef __DBL_DECIMAL_DIG__ #define DBL_DECIMAL_DIG __DBL_DECIMAL_DIG__ #else #define DBL_DECIMAL_DIG (DBL_DIG + 3) #endif #endif but I'm not exactly seeing how that buys us anything. The bigger question here is whether people actually want this behavioral change. I think there's probably a bigger chance of complaints that "casting 1.1::float8 to numeric now produces some weird, incorrectly-rounded result" than that we make anyone happier. I have a vague idea that at some point in the past we discussed making this conversion use extra_float_digits, which'd allow satisfying both camps, at the nontrivial price that the conversion would have to be considered stable not immutable. We didn't pull the trigger, if this memory is real at all, presumably because of the mutability issue. Another idea would be to leave the cast alone and introduce a named function that does the "exact" conversion. Possibly that makes nobody happy, but at least both the cast and the function could be immutable. It'd dodge backwards-compatibility objections, too. regards, tom lane
On Mon, Feb 26, 2018 at 1:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > The bigger question here is whether people actually want this behavioral > change. I think there's probably a bigger chance of complaints that > "casting 1.1::float8 to numeric now produces some weird, > incorrectly-rounded result" than that we make anyone happier. Yeah, I worry about that, too. Of course, as you know, I also have a deep and abiding skepticism about IEEE binary floats as a concept. Anomalies are unavoidable; we can choose exactly which set users experience, but we can eliminate them because the underlying storage format is poorly-adapted to the behavior people actually want. It's too bad that IEEE decimal floats weren't standardized until 2008. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 02/26/2018 01:29 PM, Tom Lane wrote: > I think there's probably a bigger chance of complaints that > "casting 1.1::float8 to numeric now produces some weird, > incorrectly-rounded result" than that we make anyone happier. Arguably, though, that's a moment that can be used to explain exactly what the correctly-rounded value of 1.1::float8 is and why, and then people both know something new and understand more precisely what's happening to their data, and can apply round() to it in exactly the places they want, if they want. In contrast, the current fact that 1.1::float8 looks like 1.1 when cast to numeric puts a superficial smile on people's faces, while they haven't really been asked how they feel about losing five, six, or thirty-eight bits of precision when casting one data type into another of supposedly greater precision and back. I think the typical assumption is that, sure, you may lose precision if you cast to a *less* precise type, but the other direction's assumed value-preserving. I can see the concern about changing behavior for code that may exist already. I would never have thought of making the behavior of a cast sensitive to extra_float_digits (in fact, I hadn't known about extra_float_digits; it's described in the "locale and formatting" section, which I never dreamed of consulting for a cast between internal value representations; am I weird in that?). I wonder if an alternative to making a cast that can't be immutable, because it looks at a GUC, could be to offer a choice of cast functions: if you need the other behavior, create a schema, do a CREATE CAST in it, and put it on your search path ahead of pg_catalog. Kind of ugly, but that can happen dealing with kind of ugly situations. -Chap
Chapman Flack <chap@anastigmatix.net> writes: > I wonder if an alternative to making a cast that can't be immutable, > because it looks at a GUC, could be to offer a choice of cast > functions: if you need the other behavior, create a schema, do a > CREATE CAST in it, and put it on your search path ahead of pg_catalog. Nope, because casts aren't schema-local, since they don't have names. There can be only one cast between given source and target types. regards, tom lane
On Mon, Feb 26, 2018 at 11:29 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Chapman Flack <chap@anastigmatix.net> writes:
> The 0002-*.patch is a proof-of-concept patching float4_numeric and
> float8_numeric in the trivial way (just using FLT_DECIMAL_DIG and
> DBL_DECIMAL_DIG in place of FLT_DIG and DBL_DIG). It makes the new
> regression test pass. (It will only work under a compiler that has
> __FLT_DECIMAL_DIG__ and __DBL_DECIMAL_DIG__ available, and I used
> those internal versions to avoid mucking with build tooling to change
> the target C standard, which I assume wouldn't be welcome anyway.
Nope. TBH, I'd think about just using "DBL_DIG + 3", given our existing
coding around extra_float_digits in places like pg_dump and postgres_fdw.
The knowledge that you need 2 or 3 extra digits is already well embedded.
Conceivably you could do it like
#ifndef DBL_DECIMAL_DIG
#ifdef __DBL_DECIMAL_DIG__
#define DBL_DECIMAL_DIG __DBL_DECIMAL_DIG__
#else
#define DBL_DECIMAL_DIG (DBL_DIG + 3)
#endif
#endif
but I'm not exactly seeing how that buys us anything.
The bigger question here is whether people actually want this behavioral
change. I think there's probably a bigger chance of complaints that
"casting 1.1::float8 to numeric now produces some weird,
incorrectly-rounded result" than that we make anyone happier.
I have a vague idea that at some point in the past we discussed making
this conversion use extra_float_digits, which'd allow satisfying both
camps, at the nontrivial price that the conversion would have to be
considered stable not immutable. We didn't pull the trigger, if this
memory is real at all, presumably because of the mutability issue.
Another idea would be to leave the cast alone and introduce a named
function that does the "exact" conversion. Possibly that makes nobody
happy, but at least both the cast and the function could be immutable.
It'd dodge backwards-compatibility objections, too.
regards, tom lane
Working for a company that
has enterprise customers this can't be overemphasized.
Never require the user to do something so they keep getting the same results.
It doesn't
matter if it's "wrong".
I would vote for a property. If you want the best effort to match the IEEE spec
you need to execute 'set use_ieee_numbers' and you'll get the extra digits and
rounding behavior. If not you'll get the existing behavior.
Bear
>> I wonder if an alternative to making a cast that can't be immutable, >> because it looks at a GUC, could be to offer a choice of cast >> functions: if you need the other behavior, create a schema, do a >> CREATE CAST in it, and put it on your search path ahead of pg_catalog. > > Nope, because casts aren't schema-local, since they don't have names. > There can be only one cast between given source and target types. In this case, I cannot see any other option than adding those as separate cast functions. Should we mark this entry as "returned with feedback"? We can also consider turning the current float to numeric casts to explicit as they are causing data loss. I am not sure how much it would impact backwards-compatibility. The counter argument is the numeric to float casts being IMPLICIT. They are causing data loss for sure, but I believe there are different reasons to keep them as IMPLICIT.
On 03/09/18 12:05, Emre Hasegeli wrote: > In this case, I cannot see any other option than adding those as > separate cast functions. Should we mark this entry as "returned with > feedback"? > > We can also consider turning the current float to numeric casts to > explicit as they are causing data loss. I am not sure how much it > would impact backwards-compatibility. The counter argument is the > numeric to float casts being IMPLICIT. They are causing data loss for > sure, but I believe there are different reasons to keep them as > IMPLICIT. Thanks for the feedback. I will mark it RWF myself, as the backward- compatibility issues are kind of paralyzing, and I don't think I'll have time in this CF to give it enough further thought anyway. I wonder whether even changing a formerly-implicit cast to explicit would be too much of a behavior change for existing code that expects the current behavior? -Chap
Chapman Flack <chap@anastigmatix.net> writes: > I wonder whether even changing a formerly-implicit cast to explicit > would be too much of a behavior change for existing code that expects > the current behavior? We did exactly that during the 8.3 cycle, and got one heck of a lot of pushback about it, despite the existence of a lot of examples showing that it would be a good idea. I won't say we can't do it again, but it won't be an easy sell. regards, tom lane
Emre Hasegeli <emre@hasegeli.com> writes: > We can also consider turning the current float to numeric casts to > explicit as they are causing data loss. I am not sure how much it > would impact backwards-compatibility. The counter argument is the > numeric to float casts being IMPLICIT. They are causing data loss for > sure, but I believe there are different reasons to keep them as > IMPLICIT. FWIW, the behavior of these casts is intended to model what the SQL standard says about the behavior of "exact numeric" vs "approximate numeric" types. (In our implementation, the integral types plus numeric are in the first category, float4 and float8 in the second.) The spec is perfectly clear that you can assign an approximate numeric to an exact numeric, or vice versa, without any explicit cast. It is also clear that arithmetic operations combining approximate and exact are allowed without a cast, and yield approximate results. These rules lead to our conclusions that exact -> approximate is an implicit cast (so that the parser will choose eg. float8 multiplication over numeric multiplication if you write numericvar * float8var) while approximate -> exact is an assignment cast (so that you can assign float8 to numeric without explicit casting). If the decisions had been driven purely by "what risks silent precision loss", no doubt we'd have done it differently ... but it's hard to do that and still meet the letter of the spec. regards, tom lane