Thread: Strange interval arithmetic
What's going on here? Some sort of integer wraparound? WORKS ===== mysql=# select interval '2378 seconds'; interval ---------- 00:39:38 (1 row) mysql=# mysql=# select 2378 * interval '1 second'; ?column? ---------- 00:39:38 (1 row) DOESN'T WORK ============ test=# select interval '2378234234 seconds'; interval -------------- 596523:14:07 (1 row) test=# select 2378234234 * interval '1 second'; ?column? -------------- 660620:37:14 (1 row)
On Sun, Nov 27, 2005 at 11:15:04PM +0800, Christopher Kings-Lynne wrote: > What's going on here? Some sort of integer wraparound? [...] > test=# select interval '2378234234 seconds'; > interval > -------------- > 596523:14:07 > (1 row) Looks like the value is stuck at 2^31 - 1 seconds: test=> select interval '2147483646 seconds'; -- 2^31 - 2 interval --------------596523:14:06 (1 row) test=> select interval '2147483647 seconds'; -- 2^31 - 1 interval --------------596523:14:07 (1 row) test=> select interval '2147483648 seconds'; -- 2^31 interval --------------596523:14:07 (1 row) -- Michael Fuhr
On Sun, Nov 27, 2005 at 08:45:18AM -0700, Michael Fuhr wrote: > Looks like the value is stuck at 2^31 - 1 seconds: I see this behavior back to at least 7.3. I'd guess it's because strtol() indicates overflow by returning LONG_MAX and setting errno to ERANGE, but the code doesn't check for that. -- Michael Fuhr
On Sun, Nov 27, 2005 at 11:27:54AM -0700, Michael Fuhr wrote: > On Sun, Nov 27, 2005 at 08:45:18AM -0700, Michael Fuhr wrote: > > Looks like the value is stuck at 2^31 - 1 seconds: > > I see this behavior back to at least 7.3. I'd guess it's because > strtol() indicates overflow by returning LONG_MAX and setting errno > to ERANGE, but the code doesn't check for that. Is this worth looking at for the upcoming dot releases? It's apparently a longstanding behavior that almost nobody encounters, yet knowingly not addressing it seems a bit MySQLish ;-) Here's the start of the thread for anybody who missed it: http://archives.postgresql.org/pgsql-hackers/2005-11/msg01385.php -- Michael Fuhr
Michael Fuhr <mike@fuhr.org> writes: >> I see this behavior back to at least 7.3. I'd guess it's because >> strtol() indicates overflow by returning LONG_MAX and setting errno >> to ERANGE, but the code doesn't check for that. > Is this worth looking at for the upcoming dot releases? Sure, send a patch ... regards, tom lane
On Wed, Nov 30, 2005 at 12:37:40PM -0500, Tom Lane wrote: > Michael Fuhr <mike@fuhr.org> writes: > >> I see this behavior back to at least 7.3. I'd guess it's because > >> strtol() indicates overflow by returning LONG_MAX and setting errno > >> to ERANGE, but the code doesn't check for that. > > > Is this worth looking at for the upcoming dot releases? > > Sure, send a patch ... Any preferences on an approach? The simplest and easiest to verify would be to raise an error for just this particular case; a TODO item might be to change how the string is parsed to allow values larger than LONG_MAX. I see several calls to strtol() that aren't checked for overflow but that might not be relevant to this problem, so I'm thinking this patch ought not touch them. Maybe that's another TODO item. -- Michael Fuhr
Michael Fuhr <mike@fuhr.org> writes: > On Wed, Nov 30, 2005 at 12:37:40PM -0500, Tom Lane wrote: >> Sure, send a patch ... > Any preferences on an approach? The simplest and easiest to verify > would be to raise an error for just this particular case; a TODO > item might be to change how the string is parsed to allow values > larger than LONG_MAX. I think the latter would be a feature enhancement and therefore not good material to back-patch. Just erroring out seems appropriate for now. > I see several calls to strtol() that aren't checked for overflow but > that might not be relevant to this problem, so I'm thinking this patch > ought not touch them. Maybe that's another TODO item. If it's possible for them to be given overflowing input, they probably ought to be checked. regards, tom lane
On Wed, Nov 30, 2005 at 02:01:46PM -0500, Tom Lane wrote: > Michael Fuhr <mike@fuhr.org> writes: > > Any preferences on an approach? The simplest and easiest to verify > > would be to raise an error for just this particular case; a TODO > > item might be to change how the string is parsed to allow values > > larger than LONG_MAX. > > I think the latter would be a feature enhancement and therefore not > good material to back-patch. Just erroring out seems appropriate > for now. Agreed. I'm thinking about rewriting strtol() calls in datetime.c to look like this: errno = 0; val = strtol(field[i], &cp, 10); if (errno == ERANGE) return DTERR_FIELD_OVERFLOW; Does that look okay? Or would you rather raise an error with ereport()? > > I see several calls to strtol() that aren't checked for overflow but > > that might not be relevant to this problem, so I'm thinking this patch > > ought not touch them. Maybe that's another TODO item. > > If it's possible for them to be given overflowing input, they probably > ought to be checked. I'm looking at all the strtol() calls in datetime.c right now; I haven't looked anywhere else yet. Should I bother checking values that will be range checked later anyway? Time zone displacements, for example? -- Michael Fuhr
Michael Fuhr wrote: > Agreed. I'm thinking about rewriting strtol() calls in datetime.c > to look like this: > > errno = 0; > val = strtol(field[i], &cp, 10); > if (errno == ERANGE) > return DTERR_FIELD_OVERFLOW; Hmm, why not check both the return value _and_ errno: val = strtol(field[i], &cp, 10); if (val == LONG_MAX && errno == ERANGE)return DTERR_FIELD_OVERFLOW; -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Wed, Nov 30, 2005 at 07:06:42PM -0300, Alvaro Herrera wrote: > Hmm, why not check both the return value _and_ errno: > > val = strtol(field[i], &cp, 10); > if (val == LONG_MAX && errno == ERANGE) > return DTERR_FIELD_OVERFLOW; I usually check both in my own code but I noticed several places where PostgreSQL doesn't, so I kept that style. I'll check both if that's preferred. -- Michael Fuhr
Michael Fuhr <mike@fuhr.org> writes: > errno = 0; > val = strtol(field[i], &cp, 10); > if (errno == ERANGE) > return DTERR_FIELD_OVERFLOW; > Does that look okay? Or would you rather raise an error with ereport()? Looks fine to me, at least in the routines that are for datetime stuff. > I'm looking at all the strtol() calls in datetime.c right now; I > haven't looked anywhere else yet. Should I bother checking values > that will be range checked later anyway? Time zone displacements, > for example? Good question. Is strtol guaranteed to return INT_MAX or INT_MIN on overflow, or might it return the overflowed value? regards, tom lane
Michael Fuhr <mike@fuhr.org> writes: > I usually check both in my own code but I noticed several places > where PostgreSQL doesn't, so I kept that style. I'll check both > if that's preferred. I'd say not --- it's more code and it makes a possibly unwarranted assumption about strtol's behavior. regards, tom lane
On Wed, Nov 30, 2005 at 05:20:54PM -0500, Tom Lane wrote: > Michael Fuhr <mike@fuhr.org> writes: > > I'm looking at all the strtol() calls in datetime.c right now; I > > haven't looked anywhere else yet. Should I bother checking values > > that will be range checked later anyway? Time zone displacements, > > for example? > > Good question. Is strtol guaranteed to return INT_MAX or INT_MIN on > overflow, or might it return the overflowed value? The Open Group Base Specifications say this: Upon successful completion, these functions shall return the converted value, if any. If no conversion could be performed,0 shall be returned and errno may be set to [EINVAL]. If the correct value is outside the range of representable values, {LONG_MIN}, {LONG_MAX}, {LLONG_MIN}, or {LLONG_MAX} shallbe returned (according to the sign of the value), and errno set to [ERANGE]. http://www.opengroup.org/onlinepubs/009695399/functions/strtol.html FreeBSD and Solaris both peg overflow at LONG_MAX, and that behavior is what I noticed in the first place. I don't know if any systems behave otherwise. Alvaro suggested checking for both LONG_MAX and ERANGE; I suppose if we check for LONG_MAX then we should also check for LONG_MIN. I don't know if any systems might set ERANGE in a non-error situation. -- Michael Fuhr
Michael Fuhr <mike@fuhr.org> writes: > I suppose if we check for LONG_MAX then we should also check > for LONG_MIN. s/should/must/, which makes the code even more complicated, in order to buy what exactly? > I don't know if any systems might set ERANGE in a non-error situation. The SUS saith http://www.opengroup.org/onlinepubs/007908799/xsh/strtol.html The strtol() function will not change the setting of errno ifsuccessful. Perhaps more to the point, we've been doing it that way (errno test only) for many years without complaints. Adding a test on the return value is venturing into less charted waters. regards, tom lane
On Wed, Nov 30, 2005 at 05:23:23PM -0500, Tom Lane wrote: > Michael Fuhr <mike@fuhr.org> writes: > > I usually check both in my own code but I noticed several places > > where PostgreSQL doesn't, so I kept that style. I'll check both > > if that's preferred. > > I'd say not --- it's more code and it makes a possibly unwarranted > assumption about strtol's behavior. OTOH, it might be an unwarranted assumption that ERANGE alone indicates error. It's possible that on some system errno's value is meaningless unless strtol() returns one of the documented indicators (LONG_MAX or LONG_MIN). I've seen system calls that behave that way: errno might get set in a non-error situation due to the underlying implementation (e.g., wrappers around socket functions in userland thread implementations), but the programmer has no business looking at errno unless the function returns -1. -- Michael Fuhr
On Wed, Nov 30, 2005 at 05:49:53PM -0500, Tom Lane wrote: > The SUS saith > http://www.opengroup.org/onlinepubs/007908799/xsh/strtol.html > > The strtol() function will not change the setting of errno if > successful. > > Perhaps more to the point, we've been doing it that way (errno test > only) for many years without complaints. Adding a test on the return > value is venturing into less charted waters. Good, I'll stick with just the ERANGE check then. -- Michael Fuhr
Tom Lane wrote: >Michael Fuhr <mike@fuhr.org> writes: > > >>I suppose if we check for LONG_MAX then we should also check >>for LONG_MIN. >> >> > >s/should/must/, which makes the code even more complicated, in order to >buy what exactly? > > > >>I don't know if any systems might set ERANGE in a non-error situation. >> >> > >The SUS saith >http://www.opengroup.org/onlinepubs/007908799/xsh/strtol.html > > The strtol() function will not change the setting of errno if > successful. > >Perhaps more to the point, we've been doing it that way (errno test >only) for many years without complaints. Adding a test on the return >value is venturing into less charted waters. > > > > LONG_MIN/LONG_MAX might be the actual values provided, too, mightn't they? checking for ERANGE seems like the only viable test. cheers andrew
On Wed, Nov 30, 2005 at 06:00:07PM -0500, Andrew Dunstan wrote: > LONG_MIN/LONG_MAX might be the actual values provided, too, mightn't > they? checking for ERANGE seems like the only viable test. Errno needs to be checked in any case for just that reason; the question was whether checking *only* errno is sufficient to detect an error. According to the standard it is. -- Michael Fuhr
Hmmm...is this something else that needs fixing? The doc says dates range from 4713 BC to 32767 AD. test=> select '11754179-08-04'::date; date ----------------11754179-08-04 (1 row) test=> select '11754179-08-05'::date; date ---------------4801-01-01 BC (1 row) -- Michael Fuhr
Tom Lane <tgl@sss.pgh.pa.us> writes: > Michael Fuhr <mike@fuhr.org> writes: > > I usually check both in my own code but I noticed several places > > where PostgreSQL doesn't, so I kept that style. I'll check both > > if that's preferred. > > I'd say not --- it's more code and it makes a possibly unwarranted > assumption about strtol's behavior. > Generally speaking looking at errno when you haven't received an error return from a libc function is asking for trouble. It could be leftover from any previous libc error. That's how you get programs saying things like "strtol: No such file or directory" ... The strtol() function returns the result of the conversion, unless the value would underflow or overflow. If an underflowoccurs, strtol() returns LONG_MIN. If an overflow occurs, strtol() returns LONG_MAX. In both cases, errno isset to ERANGE. Precisely the same holds for strtoll() (with LLONG_MIN and LLONG_MAX instead of LONG_MIN and LONG_MAX). -- greg
> Generally speaking looking at errno when you haven't received an error return > from a libc function is asking for trouble. It could be leftover from any > previous libc error. Oh, sorry, just reread the code snippet and see that isn't an issue. nevermind. -- greg
Greg Stark <gsstark@MIT.EDU> writes: > Generally speaking looking at errno when you haven't received an error return > from a libc function is asking for trouble. It could be leftover from any > previous libc error. > > That's how you get programs saying things like "strtol: No such file or > directory" ... Ah, I take back my taking back of this. It's still dicey to be doing it this way -- even if you reset errno before calling the library function. The problem is that the function can call other libc functions, which may in turn return errors. But these errors might be incidental and handled by the function you're calling. The typical case of this is calling printf which first calls isatty(). printf then returns success but leaves errno set to ENOTTY. And programs that check errno without checking the return valule -- even if they clear it before calling printf -- mysteriously print "Not a Typewriter" after correctly printing the data. The SUS standard protects against this though by only allowing this to happen for functions that don't themselves use errno to signal errors: The value of errno may be set to nonzero by a library function call whether or not there is an error, provided the use oferrno is not documented in the description of the function in this International Standard. Older platforms may still have this behaviour, but strtol seems like a pretty innocuous case. It's hard to imagine strtol needing to call much else. And strtol was an ANSI addition so one imagines most platforms got it right from the beginning. -- greg
Greg Stark <gsstark@mit.edu> writes: > Ah, I take back my taking back of this. It's still dicey to be doing it this > way -- even if you reset errno before calling the library function. See the rest of the thread. The I-believe-what-I-read-in-the-spec camp may take comfort from what it says in the SUS strtol spec, http://www.opengroup.org/onlinepubs/007908799/xsh/strtol.html The strtol() function will not change the setting of errno ifsuccessful. Those who are disinclined to assume that implementations always follow the spec may take comfort from the fact that we've been testing errno only (without also inspecting the return value) in pg_atoi for years without problems. From neither point of view do I see an argument for adding a test on the return value. It's unnecessary according to the spec, and if you disbelieve that strtol follows the spec, you should also be suspicious about whether it's guaranteed to return LONG_MAX/LONG_MIN upon overflow failure. We have no track record that gives evidence that the latter is universally true. regards, tom lane
On Thu, Dec 01, 2005 at 03:31:41PM -0500, Greg Stark wrote: > Greg Stark <gsstark@MIT.EDU> writes: > > Generally speaking looking at errno when you haven't received an error return > > from a libc function is asking for trouble. It could be leftover from any > > previous libc error. > > > > That's how you get programs saying things like "strtol: No such file or > > directory" ... > > Ah, I take back my taking back of this. It's still dicey to be doing it this > way -- even if you reset errno before calling the library function. > > The problem is that the function can call other libc functions, which may in > turn return errors. But these errors might be incidental and handled by the > function you're calling. I had that concern, as I've seen such incidental errno changes before. But Tom pointed out the following from SUS: The strtol() function shall not change the setting of errno if successful. Immediately after that the standard says: Since 0, {LONG_MIN} or {LLONG_MIN}, and {LONG_MAX} or {LLONG_MAX} are returned on error and are also valid returns on success,an application wishing to check for error situations should set errno to 0, then call strtol() or strtoll(), thencheck errno. I don't know if any systems are non-compliant in this respect, but Tom said that "we've been doing it that way (errno test only) for many years without complaints." -- Michael Fuhr