Thread: A few cases of left shifting negative integers
Hello, during my testing I've found cases of left-shifting negative integers during run-time and I was recently encouraged to post a report of them, so here it is (done against 960ea971e66bcd621ba88841b4cb85c7f0e7c383). Back-traces are included in the attached file. If any kind of information seems wrong or missing, just let me know and I'll try to fix it. (In hindsight, I should have also printed the objects' values). src/backend/utils/adt/int8.c:1262 Datum int8shl(PG_FUNCTION_ARGS) { int64 arg1 = PG_GETARG_INT64(0); int32 arg2 = PG_GETARG_INT32(1); PG_RETURN_INT64(arg1 << arg2); } src/backend/utils/adt/int.c:1258 Datum int4shl(PG_FUNCTION_ARGS) { int32 arg1 = PG_GETARG_INT32(0); int32 arg2 = PG_GETARG_INT32(1); PG_RETURN_INT32(arg1 << arg2); } src/backend/utils/adt/int.c:1320 Datum int2shl(PG_FUNCTION_ARGS) { int16 arg1 = PG_GETARG_INT16(0); int32 arg2 = PG_GETARG_INT32(1); PG_RETURN_INT16(arg1 << arg2); } src/backend/utils/adt/network.c:1523 /* * If input is narrower than int64, overflow is not possible, but we * have to do proper sign extension. */ if (carry == 0 && byte < sizeof(int64)) res |= ((int64) -1) << (byte * 8); src/backend/utils/cache/inval.c:587 else if (msg->id == SHAREDINVALSMGR_ID) { /* * We could have smgr entries for relations of other databases, so no * short-circuit test is possible here. */ RelFileNodeBackend rnode; rnode.node = msg->sm.rnode; rnode.backend = (msg->sm.backend_hi << 16) | (int) msg->sm.backend_lo; smgrclosenode(rnode); } src/timezone/localtime.c:127 static long detzcode(const char *codep) { long result; int i; result = (codep[0] & 0x80) ? ~0L : 0; for (i = 0; i < 4; ++i) result = (result << 8) | (codep[i] & 0xff); return result; } src/common/pg_lzcompress.c:404 src/common/pg_lzcompress.c:637 src/common/pg_lzcompress.c:651 These I can't conveniently reproduce and present to you, because a couple of macros are involved.
Attachment
Piotr Stefaniak <postgres@piotr-stefaniak.me> writes: > during my testing I've found cases of left-shifting negative integers > during run-time and I was recently encouraged to post a report of them, > so here it is (done against 960ea971e66bcd621ba88841b4cb85c7f0e7c383). What's your concern exactly? The behavior is well-defined, at least as long as we don't shift far enough to have integer overflow, and I believe in most of these cases that's not possible. In int8shl etc, you just get whatever the machine does with that, which seems fine to me; we do not document any particular semantics for such cases. regards, tom lane
On 2015-08-21 13:03:42 -0400, Tom Lane wrote: > The behavior is well-defined, at least as long as we don't shift far > enough to have integer overflow Unfortunately not: 5.8.2: The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are zero-filled. If E1 has an unsigned type, the value of the result is E1 × 2 E2 , reduced modulo one more than the maximum value representable in the result type. Otherwise, if E1 has a signed type and non-negative value, and E1 × 2 E2 is representable in the result type, then that is the resulting value; >>otherwise, the behavior is undefined<<. See the >><< highlighted bit...
Andres Freund <andres@anarazel.de> writes: > On 2015-08-21 13:03:42 -0400, Tom Lane wrote: >> The behavior is well-defined, at least as long as we don't shift far >> enough to have integer overflow > Unfortunately not: > 5.8.2: The value of E1 << E2 is E1 left-shifted E2 bit positions; > vacated bits are zero-filled. If E1 has an unsigned type, the value of > the result is E1 � 2 E2 , reduced modulo one more than the maximum value > representable in the result type. Otherwise, if E1 has a signed type and > non-negative value, and E1 � 2 E2 is representable in the result type, > then that is the resulting value; >>otherwise, the behavior is undefined<<. [ rolls eyes... ] There isn't any machine in the world where the behavior isn't well-defined short of overflow. Why do the C spec authors persist in pretending otherwise? regards, tom lane
On 2015-08-21 13:27:22 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2015-08-21 13:03:42 -0400, Tom Lane wrote: > >> The behavior is well-defined, at least as long as we don't shift far > >> enough to have integer overflow > > > Unfortunately not: > > 5.8.2: The value of E1 << E2 is E1 left-shifted E2 bit positions; > > vacated bits are zero-filled. If E1 has an unsigned type, the value of > > the result is E1 � 2 E2 , reduced modulo one more than the maximum value > > representable in the result type. Otherwise, if E1 has a signed type and > > non-negative value, and E1 � 2 E2 is representable in the result type, > > then that is the resulting value; >>otherwise, the behavior is undefined<<. > > [ rolls eyes... ] There isn't any machine in the world where the behavior > isn't well-defined short of overflow. > Why do the C spec authors persist in pretending otherwise? Yea, it's way past time that C is redefined being based on 2-s completement. And why this is declared undefined rather than implementation defined is completely beyond me. FWIW, icc apparently has been observed to recognize that a negative value cannot be shifted and thus optimized based on the assumption that the number is positive...