Thread: BUG #15519: Casting float4 into int4 gets the wrong sign instead of "integer out of range" error

The following bug has been logged on the website:

Bug reference:      15519
Logged by:          Victor Petrovykh
Email address:      victor@magic.io
PostgreSQL version: 10.5
Operating system:   Gentoo Linux 4.14.20
Description:

Offending examples:
SELECT ((2147483647::float4) - 1.0::float4)::int4;
SELECT ((2147483590::float4) - 1.0::float4)::int4;
SELECT ((2147483647::float4) + 1.0::float4)::int4;

They all produce the same result: -2147483648

I understand that a float4 cannot represent large integers with the same
precision as int4, that's OK. What surprised me is that instead of getting
an "overflow error" or "integer out of range" I simply got a negative result
for a value that is actually close to maximum int4. To contrast this, the
query:
SELECT ((2147483647::float4) + 200.0::float4)::int4;
The above produces the expected "ERROR:  integer out of range"


>>>>> "PG" == PG Bug reporting form <noreply@postgresql.org> writes:

 PG> The following bug has been logged on the website:
 PG> Bug reference:      15519
 PG> Logged by:          Victor Petrovykh
 PG> Email address:      victor@magic.io
 PG> PostgreSQL version: 10.5
 PG> Operating system:   Gentoo Linux 4.14.20
 PG> Description:        

 PG> Offending examples:
 PG> SELECT ((2147483647::float4) - 1.0::float4)::int4;
 PG> SELECT ((2147483590::float4) - 1.0::float4)::int4;
 PG> SELECT ((2147483647::float4) + 1.0::float4)::int4;

 PG> They all produce the same result: -2147483648

The bug here is that the ftoi4 conversion function thinks it can do
this:

    if (num < INT_MIN || num > INT_MAX || isnan(num))

but this causes INT_MIN and INT_MAX to be implicitly converted to float
(not double), which can't represent those values exactly, so the
conditions end up false in the edge cases. (Specifically, (float)INT_MAX
is a bit larger than INT_MAX.)

Probably this should be changed to use ((float8) INT_MAX) etc. instead?
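
Here's a minimal standalone C program (not backend code; IEEE-754 float
and no extended-precision evaluation assumed) that shows the roundoff:

    #include <limits.h>
    #include <stdio.h>

    int
    main(void)
    {
        /* All of the reported inputs round to exactly 2^31 in float4. */
        float num = 2147483647.0f - 1.0f;    /* ends up as 2147483648.0f */

        /* In "num > INT_MAX", INT_MAX is converted to float and rounds
         * up to 2147483648.0f, so the comparison is false here. */
        printf("(float) INT_MAX = %.1f\n", (double) (float) INT_MAX);
        printf("num > INT_MAX          : %d\n", num > INT_MAX);

        /* Against a float8 (double) boundary the overflow is caught. */
        printf("num > (double) INT_MAX : %d\n", num > (double) INT_MAX);
        return 0;
    }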

 PG> I understand that a float4 cannot represent large integers with the
 PG> same precision as int4, that's OK. What surprised me is that
 PG> instead of getting an "overflow error" or "integer out of range" I
 PG> simply got a negative result for a value that is actually close to
 PG> maximum int4. To contrast this, the query:

 PG> SELECT ((2147483647::float4) + 200.0::float4)::int4;
 PG> The above produces the expected "ERROR:  integer out of range"

Likewise, you get that error if you cast the float4 to a float8 before
casting to integer.

-- 
Andrew (irc:RhodiumToad)


PG Bug reporting form <noreply@postgresql.org> writes:
> Offending examples:
> SELECT ((2147483647::float4) - 1.0::float4)::int4;
> SELECT ((2147483590::float4) - 1.0::float4)::int4;
> SELECT ((2147483647::float4) + 1.0::float4)::int4;

> They all produce the same result: -2147483648

Huh, interesting.  The code underlying this looks sane enough
at first glance:

    float4        num = PG_GETARG_FLOAT4(0);

    if (num < INT_MIN || num > INT_MAX || isnan(num))
        ereport(ERROR,
                (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
                 errmsg("integer out of range")));

    PG_RETURN_INT32((int32) rint(num));

I think what is happening is that the compiler is interpreting
"num > INT_MAX" as something to be done in float4 arithmetic,
so it computes (float4) INT_MAX which rounds off to be the
equivalent of exactly 2147483648.  Then the problematic input
values, which all also round to 2147483648, get past the if-test
and result in undetected overflow in the cast to int32.

Perhaps we could fix this by writing

    if (num < (float8) INT_MIN || num > (float8) INT_MAX || isnan(num))

but I don't have a huge amount of confidence in that either, especially
not for the similar coding in ftoi8 and dtoi8, where the comparison values
are large enough that they'd not be exactly represented by float8 either.
Maybe we need an explicit check for the integer result being of the wrong
sign, to catch just-barely-overflowing cases like these.
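
For instance (standalone check, assuming IEEE-754 doubles), the int8
boundary already rounds up:

    #include <stdint.h>
    #include <stdio.h>

    int
    main(void)
    {
        /* INT64_MAX = 2^63 - 1 has no exact double representation; it
         * rounds up to exactly 2^63, one past the largest valid value. */
        printf("INT64_MAX          = %lld\n", (long long) INT64_MAX);
        printf("(double) INT64_MAX = %.1f\n", (double) INT64_MAX);
        return 0;
    }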

            regards, tom lane


Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "PG" == PG Bug reporting form <noreply@postgresql.org> writes:
>  PG> Offending examples:
>  PG> SELECT ((2147483647::float4) - 1.0::float4)::int4;
>  PG> SELECT ((2147483590::float4) - 1.0::float4)::int4;
>  PG> SELECT ((2147483647::float4) + 1.0::float4)::int4;
>  PG> They all produce the same result: -2147483648

> The bug here is that the ftoi4 conversion function thinks it can do
> this:
>     if (num < INT_MIN || num > INT_MAX || isnan(num))
> Probably this should be changed to use ((float8) INT_MAX) etc. instead?

After thinking about this for awhile, there's another class of undesirable
behavior here, which can be illustrated by

regression=# select '32766.1'::float4::int2;
 int2
-------
 32766
(1 row)

regression=# select '32767.1'::float4;
 float4
---------
 32767.1
(1 row)

regression=# select '32767.1'::float4::int2;
ERROR:  smallint out of range

It doesn't seem very nice that that fails rather than producing 32767.

I think we could fix that by doing the rint() first, before the
comparisons, which should be safe enough: as the Linux man page for it
points out, it really can't ever fail.

I'm inclined to write ftoi4 as

    /*
     * Get rid of any fractional part in the input.  This is so we don't
     * fail on just-out-of-range values that would round into range.
     */
    num = rint(num);

    /*
     * Range check.  We must be careful here that the boundary values are
     * expressed exactly in the appropriate float domain; we assume float4
     * is going to round off INT_MAX to a power of 2.  Also note assumption
     * that rint() will pass through a NaN or Inf unchanged.
     */
    if (unlikely(num < (float4) INT_MIN || num >= (float4) INT_MAX || isnan(num)))
        ereport(...);

    PG_RETURN_INT32((int32) num);

We could use the same pattern for the other five functions at issue,
but "num >=" would be "num >" in cases where we expect the comparison
value to be represented without roundoff.
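
To make the idea concrete, here's a standalone, compilable sketch of that
pattern (plain C rather than the actual backend function, with the
ereport() call replaced by a boolean result):

    #include <limits.h>
    #include <math.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /*
     * Sketch of the proposed ftoi4 logic: round first, then range-check
     * against boundaries that are exact in float, reporting overflow
     * through the return value.
     */
    static bool
    float4_to_int32(float num, int32_t *result)
    {
        /* Get rid of any fractional part, so just-out-of-range values
         * that round back into range are accepted. */
        num = rint(num);

        /* (float) INT_MIN is exactly -2^31, but (float) INT_MAX rounds up
         * to 2^31, hence ">=" rather than ">" on the upper bound.  rint()
         * passes NaN through, so the isnan() test still fires. */
        if (num < (float) INT_MIN || num >= (float) INT_MAX || isnan(num))
            return false;        /* would be "integer out of range" */

        *result = (int32_t) num;
        return true;
    }

    int
    main(void)
    {
        int32_t val;

        /* 2147483520 is the largest float4 value below 2^31: accepted. */
        if (float4_to_int32(2147483520.0f, &val))
            printf("%d\n", val);

        /* 2147483646 rounds to 2^31 as a float4: now rejected instead of
         * silently overflowing to INT_MIN. */
        if (!float4_to_int32(2147483646.0f, &val))
            printf("integer out of range\n");

        return 0;
    }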

Obviously, some regression test cases to verify all these assumptions
would be a good idea.

            regards, tom lane


>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 Tom>     /*
 Tom>      * Range check.  We must be careful here that the boundary values are
 Tom>      * expressed exactly in the appropriate float domain; we assume float4
 Tom>      * is going to round off INT_MAX to a power of 2.  Also note assumption
 Tom>      * that rint() will pass through a NaN or Inf unchanged.
 Tom>      */
 Tom>      if (unlikely(num < (float4) INT_MIN || num >= (float4) INT_MAX || isnan(num)))
 Tom>         ereport(...);

Looking around, another approach I've seen (which I like better) is to
use

    if (num < (float4)INT_MIN || num >= -(float4)INT_MIN || ...

which doesn't rely on assumptions about how the compiler is going to
round INT_MAX. (It does rely on two's complement representation of ints,
but that assumption is known to be safe.)

-- 
Andrew (irc:RhodiumToad)


Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>  Tom>      if (unlikely(num < (float4) INT_MIN || num >= (float4) INT_MAX || isnan(num)))

>     if (num < (float4)INT_MIN || num >= -(float4)INT_MIN || ...

Meh.  Seems to me that's relying on pretty much the same assumptions
and throwing in an extra dollop of obscurantism on top.

            regards, tom lane


I wrote:
> Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
>> if (num < (float4)INT_MIN || num >= -(float4)INT_MIN || ...

> Meh.  Seems to me that's relying on pretty much the same assumptions
> and throwing in an extra dollop of obscurantism on top.

mmm ... but on the other hand, it means we don't need to write the
test differently depending on whether we think INTxx_MAX will get
rounded.  So that's worth something.

            regards, tom lane


>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 Tom> if (unlikely(num < (float4) INT_MIN || num >= (float4) INT_MAX || isnan(num)))

 >> if (num < (float4)INT_MIN || num >= -(float4)INT_MIN || ...

 Tom> Meh. Seems to me that's relying on pretty much the same
 Tom> assumptions

No, because we know that INT_MIN is always exactly representable as a
float (whereas INT_MAX is not), and therefore the cast result will not
depend on any rounding choices whether at compile or run time. Nor does
it depend on knowing that float4 can't represent INT_MAX exactly.
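
A tiny standalone check of that (IEEE-754 float assumed):

    #include <limits.h>
    #include <stdio.h>

    int
    main(void)
    {
        /* INT_MIN is -2^31, a power of two, so it converts to float
         * exactly; negating the float gives exactly 2^31, the first
         * value strictly greater than INT_MAX. */
        printf("(float) INT_MIN  = %.1f\n", (double) (float) INT_MIN);
        printf("-(float) INT_MIN = %.1f\n", -(double) (float) INT_MIN);
        return 0;
    }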

-- 
Andrew (irc:RhodiumToad)


I'm sorry I'm kinda new here.

Am I missing something in thinking that the cast of a float to int should produce an error if the sign of the original value doesn't match the sign of the cast result? Presumably you have access to both values at some point and checking sign bits for sameness should be simple. 

Also, I would say that doing some float4 arithmetic and then casting into int4 or int8 should be consistent, so if

SELECT (54610::float4 * 39324::float4)::int8;

gets me 2147483648, then 

SELECT (54610::float4 * 39324::float4)::int4;

should be an error rather than the unintuitive -2147483648.

The scenario I'm pointing at is basically first doing some float4 arithmetic, casting the result to int4, and expecting that the result (if no error was raised) is about as close to the original float value as float4 precision normally allows.

Victor Petrovykh <victor@magic.io> writes:
> Am I missing something in thinking that the cast of a float to int should
> produce an error if the sign of the original value doesn't match the sign
> of the cast result?

Well, yeah, our traditional way of coding this overflow check would
probably have compared the sign of the result to the sign of the input,
but (a) we'd still have needed some ad-hoc range check to avoid getting
fooled by inputs large enough to have wrapped around an even number of
times, and (b) this approach depends on the compiler not taking any
liberties based on an assumption that the program doesn't provoke
integer overflow.  (b) gets more worrisome with each year that goes by,
because the compiler guys keep finding ever-more-creative ways to break
your code if it violates C-spec semantics.  So we really want to write
a test that will fail exactly when the integer coercion would overflow,
not do the coercion and then check to see if it overflowed.

            regards, tom lane


Thanks for the clarification.

I was thinking about the assumption of whether or not INT_MAX is representable exactly as a float. Consider any combination of types int_X and float_Y, where X and Y denote the number of bits used for representation:
- for any type int_X, its maximum value can never be represented by a float_Y if X >= Y, because at least one bit of the float_Y must be used for the exponent part of the float (otherwise the float_Y would be indistinguishable from int_Y)
- for any type int_X, it is always possible to represent the maximum value (and by extension any other value) exactly as a float_Y if X <= M(Y), i.e. if the mantissa of the float_Y has at least as many bits as X. In practice X and Y will be 8, 16, 32, 64, etc., so X < Y implies 2X <= Y, and unless the mantissa is less than half of the bits of the float we will have X <= M(Y) for any practical X < Y
- for any int_X and float_Y there exists a specific value FLOATY such that every float_Y value y < FLOATY satisfies y <= MAX_INT_X, and every y >= FLOATY satisfies y > MAX_INT_X (basically, I can always pick a float threshold above which all values are greater than MAX_INT_X and therefore out of range, and below which all values are guaranteed to be representable as ints). I conjecture that this special value is FLOATY = (float_Y) MAX_INT_X for any X >= Y. I think I can formally prove this conjecture; basically it has to do with the sparseness of float values near MAX_INT_X + 1 and MAX_INT_X - 1.

So it seems to me that the rule for casting depends on whether the float has at most as many bits as the int:
- when the int has the same number of bits or more (e.g. float4 -> int4 or float4 -> int8)

  if (num < INT_MIN || num >= INT_MAX || isnan(num))

- when the int has fewer bits (e.g. float4 -> int2 or float8 -> int4)

  if (num < INT_MIN || num > INT_MAX || isnan(num))

I don't think I make any unsafe architecture assumptions in the above. As far as I can tell the only such assumptions are that the likely values of X and Y are 8, 16, 32, 64, 128, etc. and that the mantissa of a float uses more than half of its bits. I'm also assuming that the casting function knows the two actual types, so it can make a deterministic decision about which comparison to use.
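
A quick standalone check of the representability point (IEEE-754 float4 with a 24-bit mantissa assumed):

    #include <limits.h>
    #include <stdio.h>

    int
    main(void)
    {
        /* float4 has a 24-bit significand, so SHRT_MAX (15 bits) survives
         * a round trip through float but INT_MAX (31 bits) does not.
         * (Compared via long long to avoid integer overflow.) */
        printf("SHRT_MAX survives float: %d\n",
               (long long) (float) SHRT_MAX == (long long) SHRT_MAX);
        printf("INT_MAX survives float:  %d\n",
               (long long) (float) INT_MAX == (long long) INT_MAX);
        return 0;
    }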

I have one more related question: is a fix for this likely to appear in the next Postgres release?


Victor Petrovykh <victor@magic.io> writes:
> I have one more related question: is a fix for this likely to appear in the
> next Postgres release?

Yes, it was pushed last week.

https://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=e473e1f2b

            regards, tom lane