Thread: Integer parsing bug?
Section 8.1 of the manual gives the range of an integer
as -2147483648 to +2147483647.

template1=# select '-2147483648'::int;
    int4
-------------
 -2147483648
(1 row)

template1=# select -2147483648::int;
ERROR: integer out of range

Oops.

template1=# select version();
                           version
-------------------------------------------------------------
 PostgreSQL 7.4.1 on i686-pc-linux-gnu, compiled by GCC 2.96
(1 row)

Completely vanilla build - no options other than --prefix to
configure. Clean installation, this is immediately after an initdb.

I see the same bug on Solaris, built with Forte C in 64 bit mode.

Cheers,
  Steve
Steve Atkins wrote:
> Section 8.1 of the manual gives the range of an integer
> as -2147483648 to +2147483647.
>
>
> template1=# select '-2147483648'::int;
>     int4
> -------------
>  -2147483648
> (1 row)
>
> template1=# select -2147483648::int;
> ERROR: integer out of range
>
> Oops.
>
> template1=# select version();
>                            version
> -------------------------------------------------------------
>  PostgreSQL 7.4.1 on i686-pc-linux-gnu, compiled by GCC 2.96
> (1 row)
>
> Completely vanilla build - no options other than --prefix to
> configure. Clean installation, this is immediately after an initdb.
>
> I see the same bug on Solaris, built with Forte C in 64 bit mode.

Yep, it definitely looks weird:

    test=> select '-2147483648'::int;
        int4
    -------------
     -2147483648
    (1 row)

    test=> select -2147483648::int;
    ERROR: integer out of range
    test=> select -2147483647::int;
      ?column?
    -------------
     -2147483647
    (1 row)

    test=> select '-2147483649'::int;
    ERROR: value "-2147483649" is out of range for type integer

The non-quoting works only for *47, and the quoting works for *48, but
both fail for *49.

I looked at libc's strtol(), and that works fine, as do our existing
parser checks. The error is coming from int84, a comparison function
called from the executor. Here is a test program:

    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[])
    {
        long long l = -2147483648;
        int i = l;

        if (i != l)
            printf("not equal\n");
        else
            printf("equal\n");
        return 0;
    }

Compiling it generates the following warning:

    tst1.c:6: warning: decimal constant is so large that it is unsigned

and the program reports "not equal".

I see in the freebsd machine/limits.h file:

 * According to ANSI (section 2.2.4.2), the values below must be usable by
 * #if preprocessing directives. Additionally, the expression must have the
 * same type as would an expression that is an object of the corresponding
 * type converted according to the integral promotions. The subtraction for
 * INT_MIN, etc., is so the value is not unsigned; e.g., 0x80000000 is an
 * unsigned int for 32-bit two's complement ANSI compilers (section 3.1.3.2).
 * These numbers are for the default configuration of gcc. They work for
 * some other compilers as well, but this should not be depended on.

    #define INT_MAX 0x7fffffff        /* max value for an int */
    #define INT_MIN (-0x7fffffff - 1) /* min value for an int */

Basically, what is happening is that the special value -INT_MAX-1 is
being converted to an int value, and the compiler is casting it to an
unsigned. Seems this is a known C issue and I can't see a good fix for
it except perhaps checking for INT_MIN in the int84 function, but I ran
some tests and that didn't work either.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
On Wed, Mar 03, 2004 at 12:31:47PM -0500, Bruce Momjian wrote:
> Yep, it definitely looks weird:
>
> test=> select '-2147483648'::int;
>     int4
> -------------
>  -2147483648
> (1 row)
>
> test=> select -2147483648::int;
> ERROR: integer out of range
> test=> select -2147483647::int;
>   ?column?
> -------------
>  -2147483647
> (1 row)
>
> test=> select '-2147483649'::int;
> ERROR: value "-2147483649" is out of range for type integer
>
> The non-quoting works only for *47, and the quoting works for *48, but
> both fail for *49.
>
> I looked at libc's strtol(), and that works fine, as do our existing
> parser checks. The error is coming from int84, a comparison function
> called from the executor. Here is a test program:

I traced through that far and managed to convince myself that the
problem was that it was considering a -...48 to be an int8, rather
than an int4, so was hitting int84() when it shouldn't have been - and
the input values for int84() looked very, very broken.

Specifically, a breakpoint on int84() fires on -..48 and -..49, but not
on -..47, suggesting that the problem is somewhere in the parsing
before it reaches int84().

I'm happy to take a look at it, but got very lost in the maze of
twisty parse routines, all alike, when I tried to track back further.
Is there any overview documentation on that end of the code?

> I see in the freebsd machine/limits.h file:
>
>  * According to ANSI (section 2.2.4.2), the values below must be usable by
>  * #if preprocessing directives. Additionally, the expression must have the
>  * same type as would an expression that is an object of the corresponding
>  * type converted according to the integral promotions. The subtraction for
>  * INT_MIN, etc., is so the value is not unsigned; e.g., 0x80000000 is an
>  * unsigned int for 32-bit two's complement ANSI compilers (section 3.1.3.2).
>  * These numbers are for the default configuration of gcc. They work for
>  * some other compilers as well, but this should not be depended on.
>
> #define INT_MAX 0x7fffffff        /* max value for an int */
> #define INT_MIN (-0x7fffffff - 1) /* min value for an int */
>
> Basically, what is happening is that the special value -INT_MAX-1 is
> being converted to an int value, and the compiler is casting it to an
> unsigned. Seems this is a known C issue and I can't see a good fix for
> it except perhaps checking for INT_MIN in the int84 function, but I ran
> some tests and that didn't work either.

I don't read it that way. INT_MIN is correctly read as a signed int,
but it can't be defined as -0x80000000 as that would be parsed as
-(0x80000000) and the constant 0x80000000 is unsigned.

Cheers,
  Steve
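Steve's reading of the limits.h comment can be checked with a few lines of C. This is only a minimal sketch, assuming a platform where int is 32 bits wide (as on the machines discussed in this thread); the function names are made up for illustration and do not come from PostgreSQL or FreeBSD.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumption: int is 32 bits wide. The checks below rely on it. */
static void require_32bit_int(void)
{
    assert(sizeof(int) * CHAR_BIT == 32);
}

/*
 * 0x80000000 does not fit in a 32-bit int, so the hex constant has type
 * unsigned int. Unary minus then wraps around modulo 2^32 and yields the
 * same value, 2147483648, which is positive.
 */
static bool negated_hex_is_positive(void)
{
    require_32bit_int();
    return -0x80000000 > 0;
}

/*
 * 0x7fffffff and 1 both fit in int, so the whole expression is signed
 * arithmetic and evaluates to the most negative int.
 */
static bool limits_h_form_is_int_min(void)
{
    require_32bit_int();
    return (-0x7fffffff - 1) == INT_MIN && (-0x7fffffff - 1) < 0;
}
```

Both functions should return true under the stated assumption, which is why the header writes INT_MIN as a subtraction rather than as a single negated constant.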
Steve Atkins <steve@blighty.com> writes:
>> test=> select -2147483648::int;
>> ERROR: integer out of range

There is no bug here. You are mistakenly assuming that the above
represents
    select (-2147483648)::int;
But actually the :: operator binds more tightly than unary minus,
so Postgres reads it as
    select -(2147483648::int);
and quite rightly fails to convert the int8 literal to int.

If you write it with the correct parenthesization it works:

regression=# select -2147483648::int;
ERROR: integer out of range
regression=# select (-2147483648)::int;
    int4
-------------
 -2147483648
(1 row)

            regards, tom lane
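The two groupings Tom describes can be modelled in C to show why the order of cast and negation matters. This is a rough sketch and not PostgreSQL source: cast_int8_to_int4, cast_then_negate and negate_then_cast are hypothetical names, and the cast reports failure with a boolean instead of raising "integer out of range".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an int8 -> int4 cast with a range check. */
static bool cast_int8_to_int4(int64_t value, int32_t *result)
{
    if (value < INT32_MIN || value > INT32_MAX)
        return false;           /* models "integer out of range" */
    *result = (int32_t) value;
    return true;
}

/* Models -(literal::int): the cast sees the positive literal first. */
static bool cast_then_negate(int64_t literal, int32_t *result)
{
    int32_t casted;

    if (!cast_int8_to_int4(literal, &casted))
        return false;
    if (casted == INT32_MIN)    /* -INT32_MIN is not representable */
        return false;
    *result = -casted;
    return true;
}

/* Models (-literal)::int: negate in 64 bits, then cast the result. */
static bool negate_then_cast(int64_t literal, int32_t *result)
{
    if (literal == INT64_MIN)   /* -INT64_MIN is not representable */
        return false;
    return cast_int8_to_int4(-literal, result);
}
```

With the literal 2147483648, cast_then_negate fails because the positive value is one past the int4 maximum, while negate_then_cast succeeds and produces the minimum int4 value. With 2147483647 both orders succeed, which matches the *47 versus *48 behaviour reported earlier in the thread.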
On Wed, Mar 03, 2004 at 06:27:07PM -0500, Tom Lane wrote:
> Steve Atkins <steve@blighty.com> writes:
> >> test=> select -2147483648::int;
> >> ERROR: integer out of range
>
> There is no bug here. You are mistakenly assuming that the above
> represents
>     select (-2147483648)::int;
> But actually the :: operator binds more tightly than unary minus,
> so Postgres reads it as
>     select -(2147483648::int);
> and quite rightly fails to convert the int8 literal to int.
>
> If you write it with the correct parenthesization it works:
>
> regression=# select -2147483648::int;
> ERROR: integer out of range
> regression=# select (-2147483648)::int;

OK... That makes sense if the parser has no support for negative
constants, but it doesn't seem like intuitive behaviour.

BTW, the original issue that led to this was:

db=> CREATE function t(integer) RETURNS integer AS '
BEGIN
  return 0;
END;
' LANGUAGE 'plpgsql';

db=> select t(-2147483648);
ERROR: function t(bigint) does not exist

Which again makes sense considering the way the parser works, but
still seems to violate the principle of least surprise.

Cheers,
  Steve
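The t(bigint) lookup follows from how an unsigned integer literal gets its type before the minus sign is applied. The sketch below assumes the rule given in the PostgreSQL manual for numeric constants (integer if the value fits in 32 bits, otherwise bigint if it fits in 64 bits, otherwise numeric). The enum and function names are invented for illustration and are not taken from the PostgreSQL source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the type the parser initially assigns. */
typedef enum { LIT_INT4, LIT_INT8, LIT_NUMERIC } LiteralType;

/*
 * Classify an unsigned decimal digit string. The sign is not part of the
 * literal; a leading '-' is a separate unary operator applied afterwards.
 */
static LiteralType classify_integer_literal(const char *digits)
{
    const uint64_t max = INT64_MAX;
    uint64_t value = 0;

    for (const char *p = digits; *p != '\0'; p++) {
        unsigned d = (unsigned) (*p - '0');

        assert(d <= 9);                 /* digits only */
        if (value > (max - d) / 10)     /* would exceed the int8 range */
            return LIT_NUMERIC;
        value = value * 10 + d;
    }
    return value <= INT32_MAX ? LIT_INT4 : LIT_INT8;
}
```

Under this model, 2147483648 is classified as int8 before the unary minus is considered, so t(-2147483648) looks for t(bigint). Following Tom's parenthesized form, writing the argument as (-2147483648)::int gives the function lookup an integer argument instead.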