Re: Ryu floating point output patch - Mailing list pgsql-hackers

From Andrew Gierth
Subject Re: Ryu floating point output patch
Date
Msg-id 871s5emitx.fsf@news-spur.riddles.org.uk
In response to Re: Ryu floating point output patch  (Andres Freund <andres@anarazel.de>)
Responses Re: Ryu floating point output patch  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Ryu floating point output patch  (Donald Dong <xdong@csumb.edu>)
List pgsql-hackers
>>>>> "Andres" == Andres Freund <andres@anarazel.de> writes:

 >> In particular, how does it know how every strtod() on the planet
 >> will react to specific input?

 Andres> strtod()'s job ought to computationally be significantly easier
 Andres> than the other way round, no? And if there's buggy strtod()
 Andres> implementations out there, why would they be guaranteed to do
 Andres> the correct thing with our current output, but not with ryu's?

Funny thing: I've been devoting considerable effort to testing this, and
the one failure mode I've found is very interesting; it's not a problem
with strtod(), in fact it's a bug in our float4in caused by _misuse_ of
strtod().

In particular, float4in thinks it's ok to do strtod() on the input, and
then round the result to a float. It is wrong to think that, and here's
why:

Consider the (float4) bit pattern x15ae43fd. The exact decimal values of
this and its adjacent values, with the midpoints between them, are:

x15ae43fc = 7.0385300755536269169437150392273653469292493678466371420654468238353729248046875e-26
  midpoint  7.0385303837024180189014515281838361605176203339429008565275580622255802154541015625e-26
x15ae43fd = 7.038530691851209120859188017140306974105991300039164570989669300615787506103515625e-26
  midpoint  7.0385310000000002228169245060967777876943622661354282854517805390059947967529296875e-26
x15ae43fe = 7.03853130814879132477466099505324860128273323223169199991389177739620208740234375e-26

Now look at what happens if you pass '7.038531e-26' as input for float4.

From the above values, the correct result is x15ae43fd, because any
other value would be strictly further away. But if you input the value
as a double first, here are the adjacent representations:

x3ab5c87fafffffff = 7.038530999999999074873222531206633286975087635036133540...e-26
         midpoint   7.0385309999999996488450735186517055373347249505857809130...e-26
x3ab5c87fb0000000 = 7.03853100000000022281692450609677778769436226613542828545...e-26
         midpoint   7.03853100000000079678877549354185003805399958168507565784...e-26
x3ab5c87fb0000001 = 7.03853100000000137076062648098692228841363689723472303024...e-26

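So the double value is x3ab5c87fb0000000, which is exactly the midpoint
between x15ae43fd and x15ae43fe, and has a mantissa of
(1).01011100100001111111101100000...

The first 23 bits (excluding the implied (1)) end in a 1 and the
discarded bits are 1000..., an exact tie, so round-to-nearest-even
rounds up to
010 1110 0100 0011 1111 1110  = 2e43fe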

So the rounded result is incorrectly x15ae43fe, rather than the expected
correct value of x15ae43fd.
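
The mantissa rounding above can be reproduced with plain integer
arithmetic. This is only a sketch: double_bits_to_float_bits() is a
hypothetical helper, not anything in PostgreSQL, and it assumes a finite
input whose result is a normal float (no zero, subnormal, overflow,
infinity or NaN handling).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert IEEE 754 binary64 bits to binary32 bits
 * using round-to-nearest, ties-to-even.  Assumes the result is a normal
 * float; anything else trips the assert below.
 */
uint32_t
double_bits_to_float_bits(uint64_t dbits)
{
    uint32_t sign = (uint32_t) (dbits >> 63);
    /* rebias the exponent from double (1023) to float (127) */
    int      exp  = (int) ((dbits >> 52) & 0x7ff) - 1023 + 127;
    uint64_t mant = dbits & ((UINT64_C(1) << 52) - 1);
    uint32_t keep = (uint32_t) (mant >> 29);           /* top 23 bits */
    uint64_t rest = mant & ((UINT64_C(1) << 29) - 1);  /* discarded 29 bits */
    uint64_t half = UINT64_C(1) << 28;                 /* exact tie value */

    /* round up if above the tie, or on a tie when the kept value is odd */
    if (rest > half || (rest == half && (keep & 1)))
        keep++;

    /* a carry out of the mantissa bumps the exponent */
    if (keep == (UINT32_C(1) << 23))
    {
        keep = 0;
        exp++;
    }

    assert(exp > 0 && exp < 255);   /* normal floats only (assumption) */
    return (sign << 31) | ((uint32_t) exp << 23) | keep;
}
```

Feeding it x3ab5c87fb0000000 hits the tie case and yields x15ae43fe,
while the next lower double, x3ab5c87fafffffff, yields x15ae43fd.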

strtof() is part of C99 and POSIX, so float4in should be using that in
place of strtod(), and that seems to fix this case for me.
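
For anyone who wants to try this outside the backend, here is a minimal
sketch comparing the two conversion paths. The function names are made
up for illustration and this is not the float4in code; it also assumes
the platform's strtod() and strtof() are correctly rounded and that
float and double are IEEE binary32 and binary64.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Return the raw bit pattern of a float. */
static uint32_t
float_bits(float f)
{
    uint32_t u;

    memcpy(&u, &f, sizeof u);
    return u;
}

/* Parse as double, then round to float: two rounding steps. */
uint32_t
parse_via_strtod(const char *s)
{
    char   *end;
    double  d = strtod(s, &end);

    assert(end != s);           /* something was parsed */
    return float_bits((float) d);
}

/* Parse directly as float: a single rounding step. */
uint32_t
parse_via_strtof(const char *s)
{
    char   *end;
    float   f = strtof(s, &end);

    assert(end != s);           /* something was parsed */
    return float_bits(f);
}
```

Under those assumptions, parse_via_strtod("7.038531e-26") should come
back as x15ae43fe and parse_via_strtof("7.038531e-26") as x15ae43fd,
matching the analysis above.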

-- 
Andrew (irc:RhodiumToad)

