Thread: Trouble with float4 after upgrading from 6.5.3 to 7.0.2

Trouble with float4 after upgrading from 6.5.3 to 7.0.2

From
"Romanenko Mikhail"
Date:
After trying to upgrade PostgreSQL from 6.5.3 to 7.0.2 I got into trouble with float4. I'll try to explain it by
example.

postgres@lee: ~$ createdb -E LATIN1 -e testfloat
CREATE DATABASE "testfloat" WITH ENCODING = 'LATIN1'
CREATE DATABASE
postgres@lee: ~$ psql testfloat
testfloat=# create table tftbl (f1 float4, f2 float4);
CREATE
testfloat=# insert into tftbl values (10, 20);
INSERT 212682 1
testfloat=# select * from tftbl;
 f1 | f2
----+----
 10 | 20
(1 row)

testfloat=# update tftbl set f1=10.1 where f1=10 and f2=20;
UPDATE 1
testfloat=# update tftbl set f2=20.2 where f1=10.1 and f2=20;
UPDATE 0
testfloat=# select * from tftbl;
  f1  | f2
------+----
 10.1 | 20
(1 row)

testfloat=# update tftbl set f2=20.2 where f1=float4(10.1) and f2=20;
UPDATE 1
testfloat=# select * from tftbl;
  f1  |  f2
------+------
 10.1 | 20.2
(1 row)

In my real client application (Windows 98, Borland C++ Builder 5.0, BDE 5.1.1.1, PostODBC 06.50.0000) I cannot in all
casesuse expressions like f1=float4(10.1) instead of simple f1=10.1 because BDE and PostODBC construct queries by
themselveswhen I, for example, update tables from BCB components (BDE complains, that another user changed the record,
whileI am the only user at the time). They use f1=10.1-like format. With PostgeSQL 6.5.3 I have no problem of the kind. 

PostgreSQL lives in Debian Linux (woody), kernel 2.2.14, libc6 2.1.3, locales 2.1.3, here is output of locale:
lee:~# locale
LANG=C
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=

Any help, tip or hint would be appreciated.

Thank you, Mikhail.





Re: Trouble with float4 after upgrading from 6.5.3 to 7.0.2

From
Tom Lane
Date:
"Romanenko Mikhail" <mikhail@angg.ru> writes:
> testfloat=# update tftbl set f1=10.1 where f1=10 and f2=20;
> UPDATE 1
> testfloat=# update tftbl set f2=20.2 where f1=10.1 and f2=20;
> UPDATE 0

The second update is failing to find any tuple that satisfies f1 = 10.1,
because f1 is a float4 variable whereas 10.1 is implicitly a float8
constant.  6.5 also treated 10.1 as float8, but managed to find equality
anyway.

I think this change in behavior is my fault :-(.  About a year ago
I cleaned up some ugly coding in float.c and (without thinking about
it very carefully) changed float48eq and related operators so that
float4-vs-float8 comparisons are done in float8 arithmetic not float4.
The old code truncated the double input down to float and did a float
equality check, while the new code promotes the float input to double
and does a double-precision comparison.

This behavior is arguably more correct than the old way from a purely
mathematical point of view, but now that I think about it, it's not
clear that it's more useful than the old way.  In particular, in an
example like the above, it's now impossible for any float4 value to be
considered exactly equal to the float8 constant 10.1, because the float4
value just hasn't got the right low-order bits after widening.

Perhaps the old way of considering equality only to float accuracy
is more useful, even though it opens us up to problems like overflow
errors in "float4var = 1e100".  Comments anyone?

A general comment on your table design though: anyone who expects exact
equality tests on fractional float values to succeed is going to get
burnt sooner or later.  If you must use this style of coding then
I recommend using numeric fields not float fields, and certainly not
float4 fields.

            regards, tom lane

Re: [HACKERS] Re: Trouble with float4 after upgrading from 6.5.3 to 7.0.2

From
Philip Warner
Date:
At 11:35 7/08/00 -0400, Tom Lane wrote:
>
>Perhaps the old way of considering equality only to float accuracy
>is more useful, even though it opens us up to problems like overflow
>errors in "float4var = 1e100".  Comments anyone?
>

The following frightened me a little:

pjw=# select float4(10.1);
 float4
--------
   10.1
(1 row)

pjw=# select float8(float4(10.1));
      float8
------------------
 10.1000003814697
(1 row)


I would have expected the latter to be at worst 10.10000000000000 +/-
.00000000000001.

Am I missing something?


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.C.N. 008 659 498)             |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

Re: [HACKERS] Re: Trouble with float4 after upgrading from 6.5.3 to 7.0.2

From
Thomas Lockhart
Date:
> Perhaps the old way of considering equality only to float accuracy
> is more useful, even though it opens us up to problems like overflow
> errors in "float4var = 1e100".  Comments anyone?

I would not have anticipated this either. I agree that downconverting to
float4 is the right solution.

Possible overflow errors can be checked in advance using the macros or
routines already there. This may be an example of why those could be A
Good Thing in some instances.

                    - Thomas

Re: [HACKERS] Re: Trouble with float4 afterupgrading from 6.5.3 to 7.0.2

From
Thomas Lockhart
Date:
> I would have expected the latter to be at worst 10.10000000000000 +/-
> .00000000000001.
> Am I missing something?

Well, yes :)

10.1 can't be represented exactly, so the float8 representation has bits
set way down at the low end of the mantissa. When converting to float4
those low bits get rounded up or down into the lowest bit of the float4
representation. At that point, you have lost knowledge that this ever
was supposed to be *exactly* 10.1. And when converting back to float8,
that float4 low bit becomes a middle-range bit in the float8
representation, with all the bits underneath that zeroed.

Back in the old days, before printf() implementations settled down, you
would be reminded of this any time you did anything, since just
assigning 10.1 and then printing it out would give you some goofy
10.099999999998 or 10.10000000001 (don't count the number of digits here
too closely, they are only qualitatively correct).

                    - Thomas

Philip Warner <pjw@rhyme.com.au> writes:
> pjw=# select float8(float4(10.1));
>       float8
> ------------------
>  10.1000003814697
> (1 row)

> I would have expected the latter to be at worst 10.10000000000000 +/-
> .00000000000001.

float4 is good to about 7 decimal digits (24 mantissa bits) on
IEEE-standard machines.  Thus the above result is actually closer
than you have any right to expect.

Don't they teach people about float arithmetic in CS 101 anymore?

            regards, tom lane

Re: [HACKERS] Re: Trouble with float4 after upgrading from 6.5.3 to 7.0.2

From
Philip Warner
Date:
At 12:11 7/08/00 -0400, Tom Lane wrote:
>Philip Warner <pjw@rhyme.com.au> writes:
>> pjw=# select float8(float4(10.1));
>>       float8
>> ------------------
>>  10.1000003814697
>> (1 row)
>
>> I would have expected the latter to be at worst 10.10000000000000 +/-
>> .00000000000001.
>
>float4 is good to about 7 decimal digits (24 mantissa bits) on
>IEEE-standard machines.  Thus the above result is actually closer
>than you have any right to expect.
>
>Don't they teach people about float arithmetic in CS 101 anymore?
>

No idea. It's a couple of decades since I did it.

I wasn't complaining about the float4 accuracy; I was complaining about the
way it was converted to float8. It seems more intuitive to zero-extend base
10 zeros...




----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.C.N. 008 659 498)             |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

Re: [HACKERS] Re: Trouble with float4 afterupgrading from 6.5.3 to 7.0.2

From
Philip Warner
Date:
At 16:12 7/08/00 +0000, Thomas Lockhart wrote:
>> I would have expected the latter to be at worst 10.10000000000000 +/-
>> .00000000000001.
>> Am I missing something?
>
>10.1 can't be represented exactly, so the float8 representation has bits
>set way down at the low end of the mantissa. When converting to float4
>those low bits get rounded up or down into the lowest bit of the float4
>representation. At that point, you have lost knowledge that this ever
>was supposed to be *exactly* 10.1. And when converting back to float8,
>that float4 low bit becomes a middle-range bit in the float8
>representation, with all the bits underneath that zeroed.
>

Now I understand, but it doesn't quite make sense given what was displayed.
The float4 value is *displayed* as 10.1, not 10.1000001, so I had assumed
that there was a level of either accuracy or display rouding happening.
When this value is converted to float8, I hoped that the result would be
the same as:

   Cast( Cast(f4val as varchar(32)) as float8)

Maybe this hope is naieve, but it it a lot more useful than the current
situation. But now that I understand what is happening, I see that (short
of varchar conversions!), it is probably quite hard to do since we can't
tell the 'correct' value.


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.C.N. 008 659 498)             |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

Re: [HACKERS] Re: Trouble with float4 afterupgrading from 6.5.3 to 7.0.2

From
Tom Lane
Date:
Philip Warner <pjw@rhyme.com.au> writes:
> Now I understand, but it doesn't quite make sense given what was displayed.
> The float4 value is *displayed* as 10.1, not 10.1000001, so I had assumed
> that there was a level of either accuracy or display rouding happening.

In float4-to-ASCII, yes.  Modern printf implementations have some
heuristics about the actual accuracy of float4 and float8 and where they
ought to round off the printed result accordingly.  But float4 to float8
is normally done just by appending zeroes to the mantissa.

I suppose we could implement the conversion as "float8in(float4out(x))"
instead of "(double) x" but it'd be several orders of magnitude slower,
as well as being *less* useful to those who know what they're doing with
float math (since the result would actually be a less-precise conversion
except in cases where the intended value has a short decimal
representation).

After thinking about it some more, I'm of the opinion that the current
behavior of float48eq and friends is the right thing, and that people
who expect 10.1 to be an exact value should be told to use type NUMERIC.
We should not kluge up the behavior of the float operations to try to
make it look like inexact values are exact.  That will just cause
failures in other situations.

            regards, tom lane

> I suppose we could implement the conversion as "float8in(float4out(x))"
> instead of "(double) x" but it'd be several orders of magnitude slower,
> as well as being *less* useful to those who know what they're doing with
> float math (since the result would actually be a less-precise conversion
> except in cases where the intended value has a short decimal
> representation).

We only need to maintain the lower-order bit(s). Seems this could be done
a lot easier than by an ascii in-between.

Is there a reason we can't perform the conversion and then copy the
low-order bits manually, with some bit-shifting and masking?

Ian


Re: [HACKERS] Re: Trouble with float4 afterupgrading from 6.5.3 to 7.0.2

From
Tom Lane
Date:
<vectro@pipeline.com> writes:
> Is there a reason we can't perform the conversion and then copy the
> low-order bits manually, with some bit-shifting and masking?

*What* low-order bits?  The fundamental problem is we don't have 'em.

            regards, tom lane

> > Is there a reason we can't perform the conversion and then copy the
> > low-order bits manually, with some bit-shifting and masking?
>
> *What* low-order bits?  The fundamental problem is we don't have 'em.

OK, we represent 10.1 (decimal) with 1010.000110011..., repeating the 0011
pattern. In floating point representation, we say this is 2^3 *
1.010000110011..., repeating the 0011 pattern.

But this dosen't work very well, because when we print it out it won't be
exact, it will be some decimal number ending in a 0 or a 5. So when we
read this out in decimal, we get 10.099999995, with a variable number of
9's depending on the precision we use.

When we print this number out, we perform the decimal conversion, and then
truncate the last decimal digit and round.

So I guess the question is, why can't we perform 4-bit float -> 8-bit
float conversion via a decimal conversion, irrespective of storing the
thing in ASCII.

Mabye the decimal conversion is too costly? Perhaps we could 1) mark 4-bit
floats with a flag of some kind indicating whether or not the decimal
conversion is necessary, and 2) avoid this conversion wherever possible,
including giving people a warning when they use float4s in their tuples.

Or mabye I'm just being dumb.

Ian




Re: [HACKERS] Re: Trouble with float4 afterupgrading from 6.5.3 to 7.0.2

From
Philip Warner
Date:
At 12:53 7/08/00 -0400, Tom Lane wrote:
>Philip Warner <pjw@rhyme.com.au> writes:
>> Now I understand, but it doesn't quite make sense given what was displayed.
>> The float4 value is *displayed* as 10.1, not 10.1000001, so I had assumed
>> that there was a level of either accuracy or display rouding happening.
>
>I suppose we could implement the conversion as "float8in(float4out(x))"
>instead of "(double) x" but it'd be several orders of magnitude slower,
>as well as being *less* useful to those who know what they're doing with
>float math (since the result would actually be a less-precise conversion
>except in cases where the intended value has a short decimal
>representation).

Would I be right in saying that "those who know what they're doing with
float math" would totally avoid mixing float4 & float8? If so, then ISTM
that any changes to float4/8 conversions should not affect them.

There seem to be a few choices:

- go with zero-extending the bits. This is easy, and what would be expected
for normal float ops, at least by people who understand float implementations.

- do an intermediate text or numeric conversion. This will produce more
expected results, but at the expense of speed. If people complain about
speed, then they can change all float types to matching precision, or use
numeric data types.

- take the code from 'printf' or 'numeric' and do the appropriate
bit-munging to get the value to use in conversion. No idea if this would
work, but it is probably better than doing a text conversion since we won't
be at the mercy of the occasional C library that produces 10.1000001.

Would it be worth having some kind of DB setting for how it handles
float4/8 conversion? Or is it just too much work, when just using all
float8 or numeric is an acceptable workaround?

Do you know how fast 'numeric' is?


>That will just cause
>failures in other situations.

If there are genuine failures that would be introduced, then clearly it's a
bad idea. But, since it will only affect people who compare float4/8, it
doesn't seem too likely to produce worse failures than the change you have
already made. I ask this mainly out of curiosity - I assume there are more
aspects to this issue that I have missed...


Bye for now,

Philip.




----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.C.N. 008 659 498)             |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

Re: Trouble with float4 after upgrading from 6.5.3 to 7.0.2

From
"Romanenko Mikhail"
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> A general comment on your table design though: anyone who expects exact
> equality tests on fractional float values to succeed is going to get
> burnt sooner or later.  If you must use this style of coding then
> I recommend using numeric fields not float fields, and certainly not
> float4 fields.
>
> regards, tom lane

I try 'numeric', 'decimal', and 'float8' types and only 'float8' works. Both 'decimal' and 'numeric' failed (as
'float4'did) with error message: "Unable to identify an operator '=' for type numeric and 'float8' You will have to
retypethis query using an explicite cast". 

And though my problem is solved, thank you for the help, I thought perhaps this information will be usefull for you.

Mikhail.


Re: [HACKERS] Re: Trouble with float4 afterupgrading from 6.5.3 to 7.0.2

From
Philip Warner
Date:
At 05:50 8/08/00 -0700, Don Baccus wrote:
>
>The simplest thing is to realize that using float4 leaves you with
>just over 7 significant digits, and to only print out 7 digits.
>Then you'll get the answer you expect (10.100000).
>

You may have missed the point; my suggestions are only aimed at changing
the results of float4/float8 conversions & comparisons.

My (very vague) recollections of this stuff is that the machine
representation is only guaranteed to be within a certain machine/language
accuracy, so the stored value is within +/-(machine error) of the 'real
value'. Further, my recollection is that one or more bits are usually used
to provide rounding information so that, eg., the 7 digit representations
are consistent.

Given this, I have assumed that printf etc use these least significant bits
to determine the 'correct' representation. The idea is to do exactly the
same in converting float4 to float8, so that:

   '4.1'::float4 = '4.1'::float8

will be true.

Maybe my recollection is false...




----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

Re: [HACKERS] Re: Trouble with float4 afterupgrading from 6.5.3 to 7.0.2

From
Tom Lane
Date:
Philip Warner <pjw@rhyme.com.au> writes:
> Given this, I have assumed that printf etc use these least significant bits
> to determine the 'correct' representation.

No.  What float4-to-text really does is *discard* information, by
rounding off the printed result to only 7 digits (when there are
actually 7-and-change in there).  This means values that are actually
distinct float4 values may get printed as the same thing:

regression=# select 1.234567 :: float4;
 ?column?
----------
  1.23457
(1 row)

regression=# select 1.234568 :: float4;
 ?column?
----------
  1.23457
(1 row)

regression=# select 1.234567 :: float4 = 1.234568 :: float4;
 ?column?
----------
 f
(1 row)

regression=# select 1.234567 :: float4 - 1.234568 :: float4;
   ?column?
--------------
 -9.53674e-07
(1 row)

I don't much care for this behavior (since it means dump and reload of
float columns is lossy), and I certainly won't hold still for
introducing it into other operations on floats.

            regards, tom lane

Re: [HACKERS] Re: Trouble with float4 afterupgrading from 6.5.3 to 7.0.2

From
Philip Warner
Date:
At 10:04 8/08/00 -0400, Tom Lane wrote:
>
>No.  What float4-to-text really does is *discard* information, by
>rounding off the printed result to only 7 digits (when there are
>actually 7-and-change in there).  This means values that are actually
>distinct float4 values may get printed as the same thing:
>

Thanks guys for some remarkably patient explanations. I now know more than
I want to know about float values.


>I don't much care for this behavior (since it means dump and reload of
>float columns is lossy), and I certainly won't hold still for
>introducing it into other operations on floats.

This makes me think that some kind of binary dump in pg_dump is probably
not a bad idea. Has anybody looked at doing a cross-platform binary COPY?
Or some other way of representing base types - we have <type>in/out maybe
<type>exp/imp (export/import) might be useful to get a portable, lossless
representation.


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/