[RFC] Fix div/mul crash and more undefined behavior - Mailing list pgsql-hackers

From Xi Wang
Subject [RFC] Fix div/mul crash and more undefined behavior
Date
Msg-id 50A96E27.6000404@gmail.com
Whole thread Raw
Responses Re: [RFC] Fix div/mul crash and more undefined behavior  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
The INT_MIN / -1 crash problem was partially addressed in 2006 and
commit 9fc6f4e1ae107f44807c4906105e1f7eb052ecb1.

http://archives.postgresql.org/pgsql-patches/2006-06/msg00102.php

However, the fix is incomplete and incorrect for some cases.

64-bit crash
============

Below is an example that crashes PostgreSQL on Windows, using the
64-bit binary from http://www.postgresql.org/download/windows/.
SELECT ((-9223372036854775808)::int8) % (-1);

Note that the previous discussion concluded that int8 (64-bit)
division didn't crash, which is incorrect.

http://archives.postgresql.org/pgsql-patches/2006-06/msg00104.php

My guess is that the previous test was carried out on a 32-bit
Windows, where there's no 64-bit division instruction.  In that
case, the generated code calls a runtime function (e.g., lldiv),
which doesn't trap.  However, on a 64-bit system, the compiler
generates the idivq instruction, which leads to a crash.

Note that the problem is not limited to division.  The following
multiplication crashes as well:
SELECT ((-9223372036854775808)::int8) * ((-1)::int8);

The reason is that the multiplication overflow check uses a division.

32-bit crash
============

The previous fix doesn't fix all possible crashes, even on a 32-bit
Windows.  Below is an example to crash a 32-bit PostgreSQL:
       SELECT ((-2147483648)::int4) % ((-1)::int2);

Portability
===========

The previous fix uses #ifdef WIN32 ... #endif, which is not portable,
as noted below:

http://archives.postgresql.org/pgsql-patches/2006-06/msg00103.php

Using a SIGFPE handler is also dangerous (e.g., causing an infinite
loop sometimes):


https://www.securecoding.cert.org/confluence/display/seccode/SIG35-C.+Do+not+return+from+SIGSEGV,+SIGILL,+or+SIGFPE+signal+handlers

Strictly speaking, using postcondition checking to detect signed
division overflows (actually, all signed integer overflows) violates
the C language standard, because signed integer overflow is undefined
behavior in C and we cannot first compute the result and then check
for overflow.  Compilers can do a lot of funny and crazy things
(e.g., generating traps or even optimizing away overflow checks).

BTW, PostgreSQL currently uses gcc's workaround option -fwrapv to
disable offending optimizations based on integer overflows.  The
problems are: (1) it doesn't always work (e.g., for division), and
(2) many other C compilers do not even support this workaround
option and can perform offending optimizations.

Patching
========

Below is a patch that fixes division crashes.  It removes #ifdef WIN32
and tries to use portable checks.  I'll send more (e.g., for fixing
multiplication crashes) if this looks good.

I understand that the existing integer overflow checks tried to
avoid dependency on constants like INT64_MIN.  But I'm not sure how
to perform simpler and portable overflow checks without using such
constants.

Also, I could use the SHRT_MIN rather than introducing INT16_MIN; I just
feel like using INT16_MIN with int16 is clearer and less confusing.

diff --git a/src/backend/utils/adt/int.c b/src/backend/utils/adt/int.c
index 9f752ed..d7867cb 100644
--- a/src/backend/utils/adt/int.c
+++ b/src/backend/utils/adt/int.c
@@ -732,30 +732,18 @@ int4div(PG_FUNCTION_ARGS)        PG_RETURN_NULL();    }
-#ifdef WIN32
-    /*
-     * Win32 doesn't throw a catchable exception for SELECT -2147483648 /
-     * (-1); -- INT_MIN
+     * Overflow check.    The only possible overflow case is for arg1 =
+     * INT32_MIN, arg2 = -1, where the correct result is -INT32_MIN, which
+     * can't be represented on a two's-complement machine.     */
-    if (arg2 == -1 && arg1 == INT_MIN)
+    if (arg1 == INT32_MIN && arg2 == -1)        ereport(ERROR,
(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),                errmsg("integer out of range")));
 
-#endif    result = arg1 / arg2;
-    /*
-     * Overflow check.    The only possible overflow case is for arg1 = INT_MIN,
-     * arg2 = -1, where the correct result is -INT_MIN, which can't be
-     * represented on a two's-complement machine.  Most machines produce
-     * INT_MIN but it seems some produce zero.
-     */
-    if (arg2 == -1 && arg1 < 0 && result <= 0)
-        ereport(ERROR,
-                (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-                 errmsg("integer out of range")));    PG_RETURN_INT32(result);}
@@ -877,18 +865,17 @@ int2div(PG_FUNCTION_ARGS)        PG_RETURN_NULL();    }
-    result = arg1 / arg2;
-
-    /*
-     * Overflow check.    The only possible overflow case is for arg1 =
-     * SHRT_MIN, arg2 = -1, where the correct result is -SHRT_MIN, which can't
-     * be represented on a two's-complement machine.  Most machines produce
-     * SHRT_MIN but it seems some produce zero.
+    /* Overflow check.    The only possible overflow case is for arg1 =
+     * INT16_MIN, arg2 = -1, where the correct result is -INT16_MIN, which
+     * can't be represented on a two's-complement machine.     */
-    if (arg2 == -1 && arg1 < 0 && result <= 0)
+    if (arg1 == INT16_MIN && arg2 == -1)        ereport(ERROR,
(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),                errmsg("smallint out of range")));
 
+
+    result = arg1 / arg2;
+    PG_RETURN_INT16(result);}
@@ -1065,18 +1052,18 @@ int42div(PG_FUNCTION_ARGS)        PG_RETURN_NULL();    }
-    result = arg1 / arg2;
-    /*
-     * Overflow check.    The only possible overflow case is for arg1 = INT_MIN,
-     * arg2 = -1, where the correct result is -INT_MIN, which can't be
-     * represented on a two's-complement machine.  Most machines produce
-     * INT_MIN but it seems some produce zero.
+     * Overflow check.    The only possible overflow case is for arg1 =
+     * INT32_MIN, arg2 = -1, where the correct result is -INT32_MIN, which
+     * can't be represented on a two's-complement machine.     */
-    if (arg2 == -1 && arg1 < 0 && result <= 0)
+    if (arg1 == INT32_MIN && arg2 == -1)        ereport(ERROR,
(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),                errmsg("integer out of range")));
 
+
+    result = arg1 / arg2;
+    PG_RETURN_INT32(result);}
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 59c110b..83531ad 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -598,18 +598,18 @@ int8div(PG_FUNCTION_ARGS)        PG_RETURN_NULL();    }
-    result = arg1 / arg2;
-    /*     * Overflow check.    The only possible overflow case is for arg1 =     * INT64_MIN, arg2 = -1, where the
correctresult is -INT64_MIN, which
 
-     * can't be represented on a two's-complement machine.    Most machines
-     * produce INT64_MIN but it seems some produce zero.
+     * can't be represented on a two's-complement machine.     */
-    if (arg2 == -1 && arg1 < 0 && result <= 0)
+    if (arg1 == INT64_MIN && arg2 == -1)        ereport(ERROR,
(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),                errmsg("bigint out of range")));
 
+
+    result = arg1 / arg2;
+    PG_RETURN_INT64(result);}
@@ -838,18 +838,18 @@ int84div(PG_FUNCTION_ARGS)        PG_RETURN_NULL();    }
-    result = arg1 / arg2;
-    /*     * Overflow check.    The only possible overflow case is for arg1 =     * INT64_MIN, arg2 = -1, where the
correctresult is -INT64_MIN, which
 
-     * can't be represented on a two's-complement machine.    Most machines
-     * produce INT64_MIN but it seems some produce zero.
+     * can't be represented on a two's-complement machine.     */
-    if (arg2 == -1 && arg1 < 0 && result <= 0)
+    if (arg1 == INT64_MIN && arg2 == -1)        ereport(ERROR,
(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),                errmsg("bigint out of range")));
 
+
+    result = arg1 / arg2;
+    PG_RETURN_INT64(result);}
@@ -1026,18 +1026,18 @@ int82div(PG_FUNCTION_ARGS)        PG_RETURN_NULL();    }
-    result = arg1 / arg2;
-    /*     * Overflow check.    The only possible overflow case is for arg1 =     * INT64_MIN, arg2 = -1, where the
correctresult is -INT64_MIN, which
 
-     * can't be represented on a two's-complement machine.    Most machines
-     * produce INT64_MIN but it seems some produce zero.
+     * can't be represented on a two's-complement machine.     */
-    if (arg2 == -1 && arg1 < 0 && result <= 0)
+    if (arg1 == INT64_MIN && arg2 == -1)        ereport(ERROR,
(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),                errmsg("bigint out of range")));
 
+
+    result = arg1 / arg2;
+    PG_RETURN_INT64(result);}
diff --git a/src/include/c.h b/src/include/c.h
index a6c0e6e..547a188 100644
--- a/src/include/c.h
+++ b/src/include/c.h
@@ -294,6 +294,29 @@ typedef unsigned long long int uint64;#define UINT64CONST(x) ((uint64) x)#endif
+#ifndef INT16_MAX
+#define INT16_MAX 32767
+#endif
+
+#ifndef INT16_MIN
+#define INT16_MIN (-INT16_MAX-1)
+#endif
+
+#ifndef INT32_MAX
+#define INT32_MAX 2147483647
+#endif
+
+#ifndef INT32_MIN
+#define INT32_MIN (-INT32_MAX-1)
+#endif
+
+#ifndef INT64_MAX
+#define INT64_MAX INT64CONST(9223372036854775807)
+#endif
+
+#ifndef INT64_MIN
+#define INT64_MIN (-INT64_MAX-1)
+#endif/* Select timestamp representation (float8 or int64) */#ifdef USE_INTEGER_DATETIMES




pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: autovacuum stress-testing our system
Next
From: Tom Lane
Date:
Subject: Re: [RFC] Fix div/mul crash and more undefined behavior