Thread: Re: [GENERAL] Trouble with float4 after upgrading from 6.5.3 to 7.0.2

Re: [GENERAL] Trouble with float4 after upgrading from 6.5.3 to 7.0.2

From
Tom Lane
Date:
"Romanenko Mikhail" <mikhail@angg.ru> writes:
> testfloat=# update tftbl set f1=10.1 where f1=10 and f2=20;
> UPDATE 1
> testfloat=# update tftbl set f2=20.2 where f1=10.1 and f2=20;
> UPDATE 0

The second update is failing to find any tuple that satisfies f1 = 10.1,
because f1 is a float4 variable whereas 10.1 is implicitly a float8
constant.  6.5 also treated 10.1 as float8, but managed to find equality
anyway.

I think this change in behavior is my fault :-(.  About a year ago
I cleaned up some ugly coding in float.c and (without thinking about
it very carefully) changed float48eq and related operators so that
float4-vs-float8 comparisons are done in float8 arithmetic not float4.
The old code truncated the double input down to float and did a float
equality check, while the new code promotes the float input to double
and does a double-precision comparison.

This behavior is arguably more correct than the old way from a purely
mathematical point of view, but now that I think about it, it's not
clear that it's more useful than the old way.  In particular, in an
example like the above, it's now impossible for any float4 value to be
considered exactly equal to the float8 constant 10.1, because the float4
value just hasn't got the right low-order bits after widening.

Perhaps the old way of considering equality only to float accuracy
is more useful, even though it opens us up to problems like overflow
errors in "float4var = 1e100".  Comments anyone?

A general comment on your table design though: anyone who expects exact
equality tests on fractional float values to succeed is going to get
burnt sooner or later.  If you must use this style of coding then
I recommend using numeric fields not float fields, and certainly not
float4 fields.

            regards, tom lane

Re: Re: [GENERAL] Trouble with float4 after upgrading from 6.5.3 to 7.0.2

From
Philip Warner
Date:
At 11:35 7/08/00 -0400, Tom Lane wrote:
>
>Perhaps the old way of considering equality only to float accuracy
>is more useful, even though it opens us up to problems like overflow
>errors in "float4var = 1e100".  Comments anyone?
>

The following frightened me a little:

pjw=# select float4(10.1);
 float4
--------
   10.1
(1 row)

pjw=# select float8(float4(10.1));
      float8
------------------
 10.1000003814697
(1 row)


I would have expected the latter to be at worst 10.10000000000000 +/-
.00000000000001.

Am I missing something?


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.C.N. 008 659 498)             |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

Re: Re: [GENERAL] Trouble with float4 after upgrading from 6.5.3 to 7.0.2

From
Thomas Lockhart
Date:
> Perhaps the old way of considering equality only to float accuracy
> is more useful, even though it opens us up to problems like overflow
> errors in "float4var = 1e100".  Comments anyone?

I would not have anticipated this either. I agree that downconverting to
float4 is the right solution.

Possible overflow errors can be checked in advance using the macros or
routines already there. This may be an example of why those could be A
Good Thing in some instances.

                    - Thomas

Re: Re: [GENERAL] Trouble with float4 afterupgrading from 6.5.3 to 7.0.2

From
Thomas Lockhart
Date:
> I would have expected the latter to be at worst 10.10000000000000 +/-
> .00000000000001.
> Am I missing something?

Well, yes :)

10.1 can't be represented exactly, so the float8 representation has bits
set way down at the low end of the mantissa. When converting to float4
those low bits get rounded up or down into the lowest bit of the float4
representation. At that point, you have lost knowledge that this ever
was supposed to be *exactly* 10.1. And when converting back to float8,
that float4 low bit becomes a middle-range bit in the float8
representation, with all the bits underneath that zeroed.

Back in the old days, before printf() implementations settled down, you
would be reminded of this any time you did anything, since just
assigning 10.1 and then printing it out would give you some goofy
10.099999999998 or 10.10000000001 (don't count the number of digits here
too closely, they are only qualitatively correct).

                    - Thomas

Philip Warner <pjw@rhyme.com.au> writes:
> pjw=# select float8(float4(10.1));
>       float8
> ------------------
>  10.1000003814697
> (1 row)

> I would have expected the latter to be at worst 10.10000000000000 +/-
> .00000000000001.

float4 is good to about 7 decimal digits (24 mantissa bits) on
IEEE-standard machines.  Thus the above result is actually closer
than you have any right to expect.

Don't they teach people about float arithmetic in CS 101 anymore?

            regards, tom lane

Re: Re: [GENERAL] Trouble with float4 after upgrading from 6.5.3 to 7.0.2

From
Philip Warner
Date:
At 12:11 7/08/00 -0400, Tom Lane wrote:
>Philip Warner <pjw@rhyme.com.au> writes:
>> pjw=# select float8(float4(10.1));
>>       float8
>> ------------------
>>  10.1000003814697
>> (1 row)
>
>> I would have expected the latter to be at worst 10.10000000000000 +/-
>> .00000000000001.
>
>float4 is good to about 7 decimal digits (24 mantissa bits) on
>IEEE-standard machines.  Thus the above result is actually closer
>than you have any right to expect.
>
>Don't they teach people about float arithmetic in CS 101 anymore?
>

No idea. It's a couple of decades since I did it.

I wasn't complaining about the float4 accuracy; I was complaining about the
way it was converted to float8. It seems more intuitive to zero-extend base
10 zeros...




----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.C.N. 008 659 498)             |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

Re: Re: [GENERAL] Trouble with float4 afterupgrading from 6.5.3 to 7.0.2

From
Philip Warner
Date:
At 16:12 7/08/00 +0000, Thomas Lockhart wrote:
>> I would have expected the latter to be at worst 10.10000000000000 +/-
>> .00000000000001.
>> Am I missing something?
>
>10.1 can't be represented exactly, so the float8 representation has bits
>set way down at the low end of the mantissa. When converting to float4
>those low bits get rounded up or down into the lowest bit of the float4
>representation. At that point, you have lost knowledge that this ever
>was supposed to be *exactly* 10.1. And when converting back to float8,
>that float4 low bit becomes a middle-range bit in the float8
>representation, with all the bits underneath that zeroed.
>

Now I understand, but it doesn't quite make sense given what was displayed.
The float4 value is *displayed* as 10.1, not 10.1000001, so I had assumed
that there was a level of either accuracy or display rouding happening.
When this value is converted to float8, I hoped that the result would be
the same as:

   Cast( Cast(f4val as varchar(32)) as float8)

Maybe this hope is naieve, but it it a lot more useful than the current
situation. But now that I understand what is happening, I see that (short
of varchar conversions!), it is probably quite hard to do since we can't
tell the 'correct' value.


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.C.N. 008 659 498)             |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

Re: Re: [GENERAL] Trouble with float4 afterupgrading from 6.5.3 to 7.0.2

From
Tom Lane
Date:
Philip Warner <pjw@rhyme.com.au> writes:
> Now I understand, but it doesn't quite make sense given what was displayed.
> The float4 value is *displayed* as 10.1, not 10.1000001, so I had assumed
> that there was a level of either accuracy or display rouding happening.

In float4-to-ASCII, yes.  Modern printf implementations have some
heuristics about the actual accuracy of float4 and float8 and where they
ought to round off the printed result accordingly.  But float4 to float8
is normally done just by appending zeroes to the mantissa.

I suppose we could implement the conversion as "float8in(float4out(x))"
instead of "(double) x" but it'd be several orders of magnitude slower,
as well as being *less* useful to those who know what they're doing with
float math (since the result would actually be a less-precise conversion
except in cases where the intended value has a short decimal
representation).

After thinking about it some more, I'm of the opinion that the current
behavior of float48eq and friends is the right thing, and that people
who expect 10.1 to be an exact value should be told to use type NUMERIC.
We should not kluge up the behavior of the float operations to try to
make it look like inexact values are exact.  That will just cause
failures in other situations.

            regards, tom lane

Re: Re: [GENERAL] Trouble with float4 after upgrading from 6.5.3 to 7.0.2

From
Don Baccus
Date:
At 02:15 AM 8/8/00 +1000, Philip Warner wrote:

>I wasn't complaining about the float4 accuracy; I was complaining about the
>way it was converted to float8. It seems more intuitive to zero-extend base
>10 zeros...

You're right, it absolutely is more intuitive - in a base-10 representation,
i.e. NUMERIC.  A float4 is a binary number, and the only zeros available
are binary ones.  The same is true of the non-zeros, for that matter.  Not
that it matters, adding zero to the right of any number doesn't add
significance,
something they pound into people's heads in the physical sciences.

Doing a type conversion from float4 to float8 is in general not a safe thing
to do, because you only can depend on 24 bits of mantissa significance 
afterwards anyway.  One such conversion will propagate that lesser
significance
all throughout the expressions using it.  Take great care when you do this.

As Tom pointed out you're getting 8 digits of decimal significance in 
your example (10.100000) due to the particular number involved.  You can
only expect 24/log2(10) digits, which as he points out is just 7 digits plus
change.

The basic problem is that we evolved with 10 fingers, rather than 8 or 16 :)



- Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
Serviceand other goodies at http://donb.photo.net.
 


Re: Re: [GENERAL] Trouble with float4 afterupgrading from 6.5.3 to 7.0.2

From
Philip Warner
Date:
At 12:53 7/08/00 -0400, Tom Lane wrote:
>Philip Warner <pjw@rhyme.com.au> writes:
>> Now I understand, but it doesn't quite make sense given what was displayed.
>> The float4 value is *displayed* as 10.1, not 10.1000001, so I had assumed
>> that there was a level of either accuracy or display rouding happening.
>
>I suppose we could implement the conversion as "float8in(float4out(x))"
>instead of "(double) x" but it'd be several orders of magnitude slower,
>as well as being *less* useful to those who know what they're doing with
>float math (since the result would actually be a less-precise conversion
>except in cases where the intended value has a short decimal
>representation).

Would I be right in saying that "those who know what they're doing with
float math" would totally avoid mixing float4 & float8? If so, then ISTM
that any changes to float4/8 conversions should not affect them.

There seem to be a few choices:

- go with zero-extending the bits. This is easy, and what would be expected
for normal float ops, at least by people who understand float implementations.

- do an intermediate text or numeric conversion. This will produce more
expected results, but at the expense of speed. If people complain about
speed, then they can change all float types to matching precision, or use
numeric data types.

- take the code from 'printf' or 'numeric' and do the appropriate
bit-munging to get the value to use in conversion. No idea if this would
work, but it is probably better than doing a text conversion since we won't
be at the mercy of the occasional C library that produces 10.1000001.

Would it be worth having some kind of DB setting for how it handles
float4/8 conversion? Or is it just too much work, when just using all
float8 or numeric is an acceptable workaround?

Do you know how fast 'numeric' is?


>That will just cause
>failures in other situations.

If there are genuine failures that would be introduced, then clearly it's a
bad idea. But, since it will only affect people who compare float4/8, it
doesn't seem too likely to produce worse failures than the change you have
already made. I ask this mainly out of curiosity - I assume there are more
aspects to this issue that I have missed...


Bye for now,

Philip.




----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.C.N. 008 659 498)             |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

Re: Re: [GENERAL] Trouble with float4 afterupgrading from 6.5.3 to 7.0.2

From
Don Baccus
Date:
At 01:37 PM 8/8/00 +1000, Philip Warner wrote:

>- do an intermediate text or numeric conversion. This will produce more
>expected results

By who?  I'm serious.  I sure wouldn't.  I can't think of any language
implementation that does this.  The standard approach has the advantage
of maintaining a defined significance.  The approach you suggest doesn't,
you're actually losing significance.  It gives the illusion of increasing
for the particular example you've chosen, but it is nothing but illusion.

>Would it be worth having some kind of DB setting for how it handles
>float4/8 conversion?

Use type numeric when you need precise decimal results.  Your suggested
kludge won't give you what you want.

>Do you know how fast 'numeric' is?

Not as fast as float by any means, but there's a reason why they exist 
in all languages which include the financial sphere in their presumed
application space.

The simplest thing is to realize that using float4 leaves you with
just over 7 significant digits, and to only print out 7 digits.
Then you'll get the answer you expect (10.100000).



- Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
Serviceand other goodies at http://donb.photo.net.
 


Re: Re: [GENERAL] Trouble with float4 afterupgrading from 6.5.3 to 7.0.2

From
Philip Warner
Date:
At 05:50 8/08/00 -0700, Don Baccus wrote:
>
>The simplest thing is to realize that using float4 leaves you with
>just over 7 significant digits, and to only print out 7 digits.
>Then you'll get the answer you expect (10.100000).
>

You may have missed the point; my suggestions are only aimed at changing
the results of float4/float8 conversions & comparisons.

My (very vague) recollections of this stuff is that the machine
representation is only guaranteed to be within a certain machine/language
accuracy, so the stored value is within +/-(machine error) of the 'real
value'. Further, my recollection is that one or more bits are usually used
to provide rounding information so that, eg., the 7 digit representations
are consistent.

Given this, I have assumed that printf etc use these least significant bits
to determine the 'correct' representation. The idea is to do exactly the
same in converting float4 to float8, so that:

   '4.1'::float4 = '4.1'::float8

will be true.

Maybe my recollection is false...




----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

Re: Re: [GENERAL] Trouble with float4 afterupgrading from 6.5.3 to 7.0.2

From
Tom Lane
Date:
Philip Warner <pjw@rhyme.com.au> writes:
> Given this, I have assumed that printf etc use these least significant bits
> to determine the 'correct' representation.

No.  What float4-to-text really does is *discard* information, by
rounding off the printed result to only 7 digits (when there are
actually 7-and-change in there).  This means values that are actually
distinct float4 values may get printed as the same thing:

regression=# select 1.234567 :: float4;
 ?column?
----------
  1.23457
(1 row)

regression=# select 1.234568 :: float4;
 ?column?
----------
  1.23457
(1 row)

regression=# select 1.234567 :: float4 = 1.234568 :: float4;
 ?column?
----------
 f
(1 row)

regression=# select 1.234567 :: float4 - 1.234568 :: float4;
   ?column?
--------------
 -9.53674e-07
(1 row)

I don't much care for this behavior (since it means dump and reload of
float columns is lossy), and I certainly won't hold still for
introducing it into other operations on floats.

            regards, tom lane

Re: Re: [GENERAL] Trouble with float4 afterupgrading from 6.5.3 to 7.0.2

From
Don Baccus
Date:
At 10:04 AM 8/8/00 -0400, Tom Lane wrote:
>Philip Warner <pjw@rhyme.com.au> writes:
>> Given this, I have assumed that printf etc use these least significant bits
>> to determine the 'correct' representation.
>
>No.  What float4-to-text really does is *discard* information, by
>rounding off the printed result to only 7 digits (when there are
>actually 7-and-change in there).

Which is the standard approach for such conversion routines.  If you
keep generating digits they're just garbage anyway.

As far as the rest, Phil, it is true that well-designed floating-point
hardware such as that which follows the IEEE spec (well, at least
better-designed compared to most of its predecessors) strictly specifies
how extra rounding information is to be used during various mathematical
operations (add, multiply, etc).  This is done so that error due to rounding
can be strictly bounded.  The result (for float4) is a 24-bit mantissa
with strictly defined significance.

However, the float4 itself once stored only consists of that 24-bit 
mantissa.   There's no way to know the history of how that 24th bit
was generated, i.e. whether all the bits to the right were exactly
zero or whether it was the result of rounding (or truncation if the
user specified it and the hardware supports it).  

Kludging conversion by using decimal conversion will simply lose
significance.  In your 10.1 case you'll be happy because that 24th
bit becomes zero.

All you've accomplished, though, is to throw away (at least) one bit.
Your float8 now has no more than 23 bits of significance rather than
24.  Repeat this process a few times and you could store the result
in a boolean, in terms of the bits you could guarantee to be 
significant ...

>I don't much care for this behavior (since it means dump and reload of
>float columns is lossy),

A good reason for binary backup programs!

> and I certainly won't hold still for
>introducing it into other operations on floats.

No, it flies in the face of not only convention, but a lot of investigation
into how to implement floating point arithmetic in a way that's useful
to those who have to depend on the results having mathematically definable
error bounds.



- Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
Serviceand other goodies at http://donb.photo.net.
 


Re: Re: [GENERAL] Trouble with float4 afterupgrading from 6.5.3 to 7.0.2

From
Philip Warner
Date:
At 10:04 8/08/00 -0400, Tom Lane wrote:
>
>No.  What float4-to-text really does is *discard* information, by
>rounding off the printed result to only 7 digits (when there are
>actually 7-and-change in there).  This means values that are actually
>distinct float4 values may get printed as the same thing:
>

Thanks guys for some remarkably patient explanations. I now know more than
I want to know about float values.


>I don't much care for this behavior (since it means dump and reload of
>float columns is lossy), and I certainly won't hold still for
>introducing it into other operations on floats.

This makes me think that some kind of binary dump in pg_dump is probably
not a bad idea. Has anybody looked at doing a cross-platform binary COPY?
Or some other way of representing base types - we have <type>in/out maybe
<type>exp/imp (export/import) might be useful to get a portable, lossless
representation.


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

Re: Re: [GENERAL] Trouble with float4 afterupgrading from 6.5.3 to 7.0.2

From
Don Baccus
Date:
At 01:40 AM 8/9/00 +1000, Philip Warner wrote:

>This makes me think that some kind of binary dump in pg_dump is probably
>not a bad idea. Has anybody looked at doing a cross-platform binary COPY?
>Or some other way of representing base types - we have <type>in/out maybe
><type>exp/imp (export/import) might be useful to get a portable, lossless
>representation.

Another way to do it is to dump/restore floats in hex, maintaining the
actual binary values.

Conversion to hex, unlike conversion to decimal, is exact (16 is a power
of 2 while 10 is not, to add to your "more knowledge than you want" about
floats!)



- Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert
Serviceand other goodies at http://donb.photo.net.