Thread: Strange interval arithmetic

Strange interval arithmetic

From
Christopher Kings-Lynne
Date:
What's going on here?  Some sort of integer wraparound?

WORKS
=====

mysql=# select interval '2378 seconds';
 interval
----------
 00:39:38
(1 row)

mysql=#
mysql=# select 2378 * interval '1 second';
 ?column?
----------
 00:39:38
(1 row)

DOESN'T WORK
============

test=# select interval '2378234234 seconds';
   interval
--------------
 596523:14:07
(1 row)

test=# select 2378234234 * interval '1 second';
   ?column?
--------------
 660620:37:14
(1 row)



Re: Strange interval arithmetic

From
Michael Fuhr
Date:
On Sun, Nov 27, 2005 at 11:15:04PM +0800, Christopher Kings-Lynne wrote:
> What's going on here?  Some sort of integer wraparound?
[...]
> test=# select interval '2378234234 seconds';
>    interval
> --------------
>  596523:14:07
> (1 row)

Looks like the value is stuck at 2^31 - 1 seconds:

test=> select interval '2147483646 seconds';  -- 2^31 - 2
   interval
--------------
 596523:14:06
(1 row)

test=> select interval '2147483647 seconds';  -- 2^31 - 1
   interval
--------------
 596523:14:07
(1 row)

test=> select interval '2147483648 seconds';  -- 2^31
   interval
--------------
 596523:14:07
(1 row)
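
As a cross-check on the output above: 2^31 - 1 seconds is exactly 596523 hours, 14 minutes and 7 seconds, which is consistent with the parsed value being capped at the largest 32-bit signed integer. The multiplication result from the original report (660620:37:14) is what 2378234234 seconds actually works out to. A minimal sketch of that arithmetic in C follows; `split_seconds` and `struct hms` are hypothetical names used only for illustration and are not PostgreSQL code.

```c
#include <assert.h>

/* Hypothetical container for an hours:minutes:seconds breakdown. */
struct hms {
    long long hours;
    int minutes;
    int seconds;
};

/* Split a non-negative second count into hours, minutes and seconds. */
static struct hms split_seconds(long long total)
{
    struct hms r;

    assert(total >= 0);
    r.hours = total / 3600;
    r.minutes = (int) ((total % 3600) / 60);
    r.seconds = (int) (total % 60);
    return r;
}
```

Calling `split_seconds(2147483647LL)` gives 596523:14:07, the pegged value; `split_seconds(2378234234LL)` gives 660620:37:14.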

-- 
Michael Fuhr


Re: Strange interval arithmetic

From
Michael Fuhr
Date:
On Sun, Nov 27, 2005 at 08:45:18AM -0700, Michael Fuhr wrote:
> Looks like the value is stuck at 2^31 - 1 seconds:

I see this behavior back to at least 7.3.  I'd guess it's because
strtol() indicates overflow by returning LONG_MAX and setting errno
to ERANGE, but the code doesn't check for that.
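
The behavior guessed at above can be reproduced outside the server. This is a minimal sketch, assuming a C library that follows the C standard's description of strtol(); `strtol_overflowed` is a hypothetical helper written for this illustration, not a function from the PostgreSQL source.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse text as a base-10 long, store the result
 * in *out, and return 1 if strtol() reported a range error, else 0.
 */
static int strtol_overflowed(const char *text, long *out)
{
    char *end;

    assert(text != NULL && out != NULL);
    errno = 0;                      /* clear any stale error first */
    *out = strtol(text, &end, 10);
    return errno == ERANGE;
}
```

Given a string such as "99999999999999999999999999", which is too large for either a 32-bit or a 64-bit long, the helper returns 1 and the stored value is LONG_MAX. Code that ignores errno would silently carry on with LONG_MAX, which matches the pegged interval seen earlier in the thread.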

-- 
Michael Fuhr


Re: Strange interval arithmetic

From
Michael Fuhr
Date:
On Sun, Nov 27, 2005 at 11:27:54AM -0700, Michael Fuhr wrote:
> On Sun, Nov 27, 2005 at 08:45:18AM -0700, Michael Fuhr wrote:
> > Looks like the value is stuck at 2^31 - 1 seconds:
> 
> I see this behavior back to at least 7.3.  I'd guess it's because
> strtol() indicates overflow by returning LONG_MAX and setting errno
> to ERANGE, but the code doesn't check for that.

Is this worth looking at for the upcoming dot releases?  It's
apparently a longstanding behavior that almost nobody encounters,
yet knowingly not addressing it seems a bit MySQLish ;-)  Here's
the start of the thread for anybody who missed it:

http://archives.postgresql.org/pgsql-hackers/2005-11/msg01385.php

-- 
Michael Fuhr


Re: Strange interval arithmetic

From
Tom Lane
Date:
Michael Fuhr <mike@fuhr.org> writes:
>> I see this behavior back to at least 7.3.  I'd guess it's because
>> strtol() indicates overflow by returning LONG_MAX and setting errno
>> to ERANGE, but the code doesn't check for that.

> Is this worth looking at for the upcoming dot releases?

Sure, send a patch ...
        regards, tom lane


Re: Strange interval arithmetic

From
Michael Fuhr
Date:
On Wed, Nov 30, 2005 at 12:37:40PM -0500, Tom Lane wrote:
> Michael Fuhr <mike@fuhr.org> writes:
> >> I see this behavior back to at least 7.3.  I'd guess it's because
> >> strtol() indicates overflow by returning LONG_MAX and setting errno
> >> to ERANGE, but the code doesn't check for that.
> 
> > Is this worth looking at for the upcoming dot releases?
> 
> Sure, send a patch ...

Any preferences on an approach?  The simplest and easiest to verify
would be to raise an error for just this particular case; a TODO
item might be to change how the string is parsed to allow values
larger than LONG_MAX.  I see several calls to strtol() that aren't
checked for overflow but that might not be relevant to this problem,
so I'm thinking this patch ought not touch them.  Maybe that's another
TODO item.

-- 
Michael Fuhr


Re: Strange interval arithmetic

From
Tom Lane
Date:
Michael Fuhr <mike@fuhr.org> writes:
> On Wed, Nov 30, 2005 at 12:37:40PM -0500, Tom Lane wrote:
>> Sure, send a patch ...

> Any preferences on an approach?  The simplest and easiest to verify
> would be to raise an error for just this particular case; a TODO
> item might be to change how the string is parsed to allow values
> larger than LONG_MAX.

I think the latter would be a feature enhancement and therefore not
good material to back-patch.  Just erroring out seems appropriate
for now.

> I see several calls to strtol() that aren't checked for overflow but
> that might not be relevant to this problem, so I'm thinking this patch
> ought not touch them.  Maybe that's another TODO item.

If it's possible for them to be given overflowing input, they probably
ought to be checked.
        regards, tom lane


Re: Strange interval arithmetic

From
Michael Fuhr
Date:
On Wed, Nov 30, 2005 at 02:01:46PM -0500, Tom Lane wrote:
> Michael Fuhr <mike@fuhr.org> writes:
> > Any preferences on an approach?  The simplest and easiest to verify
> > would be to raise an error for just this particular case; a TODO
> > item might be to change how the string is parsed to allow values
> > larger than LONG_MAX.
> 
> I think the latter would be a feature enhancement and therefore not
> good material to back-patch.  Just erroring out seems appropriate
> for now.

Agreed.  I'm thinking about rewriting strtol() calls in datetime.c
to look like this:
  errno = 0;
  val = strtol(field[i], &cp, 10);
  if (errno == ERANGE)
      return DTERR_FIELD_OVERFLOW;

Does that look okay?  Or would you rather raise an error with ereport()?

> > I see several calls to strtol() that aren't checked for overflow but
> > that might not be relevant to this problem, so I'm thinking this patch
> > ought not touch them.  Maybe that's another TODO item.
> 
> If it's possible for them to be given overflowing input, they probably
> ought to be checked.

I'm looking at all the strtol() calls in datetime.c right now; I
haven't looked anywhere else yet.  Should I bother checking values
that will be range checked later anyway?  Time zone displacements,
for example?

-- 
Michael Fuhr


Re: Strange interval arithmetic

From
Alvaro Herrera
Date:
Michael Fuhr wrote:

> Agreed.  I'm thinking about rewriting strtol() calls in datetime.c
> to look like this:
> 
>   errno = 0;
>   val = strtol(field[i], &cp, 10);
>   if (errno == ERANGE)
>       return DTERR_FIELD_OVERFLOW;

Hmm, why not check both the return value _and_ errno:

val = strtol(field[i], &cp, 10);
if (val == LONG_MAX && errno == ERANGE)
    return DTERR_FIELD_OVERFLOW;

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Strange interval arithmetic

From
Michael Fuhr
Date:
On Wed, Nov 30, 2005 at 07:06:42PM -0300, Alvaro Herrera wrote:
> Hmm, why not check both the return value _and_ errno:
> 
> val = strtol(field[i], &cp, 10);
> if (val == LONG_MAX && errno == ERANGE)
>     return DTERR_FIELD_OVERFLOW;

I usually check both in my own code but I noticed several places
where PostgreSQL doesn't, so I kept that style.  I'll check both
if that's preferred.

-- 
Michael Fuhr


Re: Strange interval arithmetic

From
Tom Lane
Date:
Michael Fuhr <mike@fuhr.org> writes:
>   errno = 0;
>   val = strtol(field[i], &cp, 10);
>   if (errno == ERANGE)
>       return DTERR_FIELD_OVERFLOW;

> Does that look okay?  Or would you rather raise an error with ereport()?

Looks fine to me, at least in the routines that are for datetime stuff.

> I'm looking at all the strtol() calls in datetime.c right now; I
> haven't looked anywhere else yet.  Should I bother checking values
> that will be range checked later anyway?  Time zone displacements,
> for example?

Good question.  Is strtol guaranteed to return INT_MAX or INT_MIN on
overflow, or might it return the overflowed value?
        regards, tom lane


Re: Strange interval arithmetic

From
Tom Lane
Date:
Michael Fuhr <mike@fuhr.org> writes:
> I usually check both in my own code but I noticed several places
> where PostgreSQL doesn't, so I kept that style.  I'll check both
> if that's preferred.

I'd say not --- it's more code and it makes a possibly unwarranted
assumption about strtol's behavior.
        regards, tom lane


Re: Strange interval arithmetic

From
Michael Fuhr
Date:
On Wed, Nov 30, 2005 at 05:20:54PM -0500, Tom Lane wrote:
> Michael Fuhr <mike@fuhr.org> writes:
> > I'm looking at all the strtol() calls in datetime.c right now; I
> > haven't looked anywhere else yet.  Should I bother checking values
> > that will be range checked later anyway?  Time zone displacements,
> > for example?
> 
> Good question.  Is strtol guaranteed to return INT_MAX or INT_MIN on
> overflow, or might it return the overflowed value?

The Open Group Base Specifications say this:
    Upon successful completion, these functions shall return the
    converted value, if any.  If no conversion could be performed,
    0 shall be returned and errno may be set to [EINVAL].

    If the correct value is outside the range of representable
    values, {LONG_MIN}, {LONG_MAX}, {LLONG_MIN}, or {LLONG_MAX}
    shall be returned (according to the sign of the value), and
    errno set to [ERANGE].

http://www.opengroup.org/onlinepubs/009695399/functions/strtol.html

FreeBSD and Solaris both peg overflow at LONG_MAX, and that behavior
is what I noticed in the first place.  I don't know if any systems
behave otherwise.  Alvaro suggested checking for both LONG_MAX and
ERANGE; I suppose if we check for LONG_MAX then we should also check
for LONG_MIN.  I don't know if any systems might set ERANGE in a
non-error situation.

-- 
Michael Fuhr


Re: Strange interval arithmetic

From
Tom Lane
Date:
Michael Fuhr <mike@fuhr.org> writes:
> I suppose if we check for LONG_MAX then we should also check
> for LONG_MIN.

s/should/must/, which makes the code even more complicated, in order to
buy what exactly?

> I don't know if any systems might set ERANGE in a non-error situation.

The SUS saith
http://www.opengroup.org/onlinepubs/007908799/xsh/strtol.html

    The strtol() function will not change the setting of errno if
    successful.

Perhaps more to the point, we've been doing it that way (errno test
only) for many years without complaints.  Adding a test on the return
value is venturing into less charted waters.
        regards, tom lane


Re: Strange interval arithmetic

From
Michael Fuhr
Date:
On Wed, Nov 30, 2005 at 05:23:23PM -0500, Tom Lane wrote:
> Michael Fuhr <mike@fuhr.org> writes:
> > I usually check both in my own code but I noticed several places
> > where PostgreSQL doesn't, so I kept that style.  I'll check both
> > if that's preferred.
> 
> I'd say not --- it's more code and it makes a possibly unwarranted
> assumption about strtol's behavior.

OTOH, it might be an unwarranted assumption that ERANGE alone
indicates error.  It's possible that on some system errno's value
is meaningless unless strtol() returns one of the documented
indicators (LONG_MAX or LONG_MIN).  I've seen system calls that
behave that way: errno might get set in a non-error situation due
to the underlying implementation (e.g., wrappers around socket
functions in userland thread implementations), but the programmer
has no business looking at errno unless the function returns -1.

-- 
Michael Fuhr


Re: Strange interval arithmetic

From
Michael Fuhr
Date:
On Wed, Nov 30, 2005 at 05:49:53PM -0500, Tom Lane wrote:
> The SUS saith
> http://www.opengroup.org/onlinepubs/007908799/xsh/strtol.html
> 
>     The strtol() function will not change the setting of errno if
>     successful.
> 
> Perhaps more to the point, we've been doing it that way (errno test
> only) for many years without complaints.  Adding a test on the return
> value is venturing into less charted waters.

Good, I'll stick with just the ERANGE check then.

-- 
Michael Fuhr


Re: Strange interval arithmetic

From
Andrew Dunstan
Date:

Tom Lane wrote:

>Michael Fuhr <mike@fuhr.org> writes:
>>I suppose if we check for LONG_MAX then we should also check
>>for LONG_MIN.
>
>s/should/must/, which makes the code even more complicated, in order to
>buy what exactly?
>
>>I don't know if any systems might set ERANGE in a non-error situation.
>
>The SUS saith
>http://www.opengroup.org/onlinepubs/007908799/xsh/strtol.html
>
>    The strtol() function will not change the setting of errno if
>    successful.
>
>Perhaps more to the point, we've been doing it that way (errno test
>only) for many years without complaints.  Adding a test on the return
>value is venturing into less charted waters.

LONG_MIN/LONG_MAX might be the actual values provided, too, mightn't
they? Checking for ERANGE seems like the only viable test.

cheers

andrew


Re: Strange interval arithmetic

From
Michael Fuhr
Date:
On Wed, Nov 30, 2005 at 06:00:07PM -0500, Andrew Dunstan wrote:
> LONG_MIN/LONG_MAX might be the actual values provided, too, mightn't 
> they? checking for ERANGE seems like the only viable test.

Errno needs to be checked in any case for just that reason; the
question was whether checking *only* errno is sufficient to detect
an error.  According to the standard it is.
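
The point about LONG_MAX being both a valid result and the overflow indicator can be illustrated with a small sketch. It assumes a C library that follows the standard; `parse_long_checked`, `long_max_is_valid_input` and `long_max_times_ten_overflows` are hypothetical names introduced here, not PostgreSQL functions.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: 0 on success, -1 if strtol() set ERANGE. */
static int parse_long_checked(const char *text, long *out)
{
    char *end;

    assert(text != NULL && out != NULL);
    errno = 0;
    *out = strtol(text, &end, 10);
    return (errno == ERANGE) ? -1 : 0;
}

/* Returns 1 if the decimal spelling of LONG_MAX parses without error. */
static int long_max_is_valid_input(void)
{
    char buf[64];
    long val;

    snprintf(buf, sizeof buf, "%ld", LONG_MAX);
    return parse_long_checked(buf, &val) == 0 && val == LONG_MAX;
}

/*
 * Returns 1 if LONG_MAX with one more digit appended is rejected,
 * even though strtol() hands back the very same value, LONG_MAX.
 */
static int long_max_times_ten_overflows(void)
{
    char buf[64];
    long val;

    snprintf(buf, sizeof buf, "%ld0", LONG_MAX);
    return parse_long_checked(buf, &val) == -1 && val == LONG_MAX;
}
```

Both cases return LONG_MAX from strtol(); only errno tells them apart, so a test on the return value alone cannot replace the errno check.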

-- 
Michael Fuhr


Re: Strange interval arithmetic

From
Michael Fuhr
Date:
Hmmm...is this something else that needs fixing?  The doc says dates
range from 4713 BC to 32767 AD.

test=> select '11754179-08-04'::date;
      date
----------------
 11754179-08-04
(1 row)

test=> select '11754179-08-05'::date;
     date
---------------
 4801-01-01 BC
(1 row)

-- 
Michael Fuhr


Re: Strange interval arithmetic

From
Greg Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Michael Fuhr <mike@fuhr.org> writes:
> > I usually check both in my own code but I noticed several places
> > where PostgreSQL doesn't, so I kept that style.  I'll check both
> > if that's preferred.
> 
> I'd say not --- it's more code and it makes a possibly unwarranted
> assumption about strtol's behavior.
> 

Generally speaking looking at errno when you haven't received an error return
from a libc function is asking for trouble. It could be leftover from any
previous libc error. 

That's how you get programs saying things like "strtol: No such file or
directory" ...

    The strtol() function returns the result of the conversion, unless
    the value would underflow or overflow. If an underflow occurs,
    strtol() returns LONG_MIN. If an overflow occurs, strtol() returns
    LONG_MAX. In both cases, errno is set to ERANGE. Precisely the same
    holds for strtoll() (with LLONG_MIN and LLONG_MAX instead of
    LONG_MIN and LONG_MAX).


-- 
greg



Re: Strange interval arithmetic

From
Greg Stark
Date:
> Generally speaking looking at errno when you haven't received an error return
> from a libc function is asking for trouble. It could be leftover from any
> previous libc error. 

Oh, sorry, just reread the code snippet and see that isn't an issue. nevermind.

-- 
greg



Re: Strange interval arithmetic

From
Greg Stark
Date:
Greg Stark <gsstark@MIT.EDU> writes:

> Generally speaking looking at errno when you haven't received an error return
> from a libc function is asking for trouble. It could be leftover from any
> previous libc error. 
> 
> That's how you get programs saying things like "strtol: No such file or
> directory" ...

Ah, I take back my taking back of this. It's still dicey to be doing it this
way -- even if you reset errno before calling the library function.

The problem is that the function can call other libc functions, which may in
turn return errors. But these errors might be incidental and handled by the
function you're calling.

The typical case of this is calling printf which first calls isatty(). printf
then returns success but leaves errno set to ENOTTY. And programs that check
errno without checking the return value -- even if they clear it before
calling printf -- mysteriously print "Not a Typewriter" after correctly
printing the data.

The SUS standard protects against this though by only allowing this to happen
for functions that don't themselves use errno to signal errors:
    The value of errno may be set to nonzero by a library function
    call whether or not there is an error, provided the use of errno
    is not documented in the description of the function in this
    International Standard.

Older platforms may still have this behaviour, but strtol seems like a pretty
innocuous case. It's hard to imagine strtol needing to call much else. And
strtol was an ANSI addition so one imagines most platforms got it right from
the beginning.

-- 
greg



Re: Strange interval arithmetic

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> Ah, I take back my taking back of this. It's still dicey to be doing it this
> way -- even if you reset errno before calling the library function.

See the rest of the thread.  The I-believe-what-I-read-in-the-spec camp
may take comfort from what it says in the SUS strtol spec,
http://www.opengroup.org/onlinepubs/007908799/xsh/strtol.html

    The strtol() function will not change the setting of errno if
    successful.

Those who are disinclined to assume that implementations always follow
the spec may take comfort from the fact that we've been testing errno
only (without also inspecting the return value) in pg_atoi for years
without problems.

From neither point of view do I see an argument for adding a test on the
return value.  It's unnecessary according to the spec, and if you
disbelieve that strtol follows the spec, you should also be suspicious
about whether it's guaranteed to return LONG_MAX/LONG_MIN upon overflow
failure.  We have no track record that gives evidence that the latter is
universally true.
        regards, tom lane


Re: Strange interval arithmetic

From
Michael Fuhr
Date:
On Thu, Dec 01, 2005 at 03:31:41PM -0500, Greg Stark wrote:
> Greg Stark <gsstark@MIT.EDU> writes:
> > Generally speaking looking at errno when you haven't received an error return
> > from a libc function is asking for trouble. It could be leftover from any
> > previous libc error. 
> > 
> > That's how you get programs saying things like "strtol: No such file or
> > directory" ...
> 
> Ah, I take back my taking back of this. It's still dicey to be doing it this
> way -- even if you reset errno before calling the library function.
> 
> The problem is that the function can call other libc functions, which may in
> turn return errors. But these errors might be incidental and handled by the
> function you're calling.

I had that concern, as I've seen such incidental errno changes
before.  But Tom pointed out the following from SUS:

    The strtol() function shall not change the setting of errno if
    successful.

Immediately after that the standard says:

    Since 0, {LONG_MIN} or {LLONG_MIN}, and {LONG_MAX} or {LLONG_MAX}
    are returned on error and are also valid returns on success, an
    application wishing to check for error situations should set
    errno to 0, then call strtol() or strtoll(), then check errno.

I don't know if any systems are non-compliant in this respect, but
Tom said that "we've been doing it that way (errno test only) for
many years without complaints."

-- 
Michael Fuhr