Re: Infinite Interval - Mailing list pgsql-hackers

From Joseph Koshakow
Subject Re: Infinite Interval
Date
Msg-id CAAvxfHdNDLJZJoPcAo_52sLxEMhqhm8-Bod31qF36puUtJZVnQ@mail.gmail.com
Whole thread Raw
In response to Re: Infinite Interval  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Responses Re: Infinite Interval  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Infinite Interval  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
List pgsql-hackers


On Thu, Mar 9, 2023 at 12:42 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
>
>
>     static pg_tz *FetchDynamicTimeZone(TimeZoneAbbrevTable *tbl, const datetkn *tp,
>    -                                   DateTimeErrorExtra *extra);
>    +                                   DateTimeErrorExtra * extra);
>
>    There are a lot of these diffs. PG code doesn't leave an extra space between
>    variable name and *.

Those appeared from running pg_indent. I've removed them all.

>         /* Handle the integer part */
>    -    if (!int64_multiply_add(val, scale, &itm_in->tm_usec))
>    +    if (pg_mul_add_s64_overflow(val, scale, &itm_in->tm_usec))
>
>    I think this is a good change, since we are moving the function to int.h where
>    it belongs. We could separate these kind of changes into another patch for easy
>    review.

I've separated this out into another patch attached to this email.
Should I start a new email thread or is it ok to include it in this
one?

>    +
>    +    result->day = days;
>    +    if (pg_mul_add_s32_overflow(weeks, 7, &result->day))
>    +        ereport(ERROR,
>    +                (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
>    +                 errmsg("interval out of range")));
>
>    I think such changes are also good, but probably a separate patch for ease of
>    review.

I've separated a patch for this too, which I've also included in this
email.

>         secs = rint(secs * USECS_PER_SEC);
>    -    result->time = hours mj* ((int64) SECS_PER_HOUR * USECS_PER_SEC) +
>    -        mins * ((int64) SECS_PER_MINUTE * USECS_PER_SEC) +
>    -        (int64) secs;
>    +
>    +    result->time = secs;
>    +    if (pg_mul_add_s64_overflow(mins, ((int64) SECS_PER_MINUTE *
>    USECS_PER_SEC), &result->time))
>    +        ereport(ERROR,
>    +                (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
>    +                 errmsg("interval out of range")));
>    +    if (pg_mul_add_s64_overflow(hours, ((int64) SECS_PER_HOUR *
>    USECS_PER_SEC), &result->time))
>    +        ereport(ERROR,
>    +                (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
>    +                 errmsg("interval out of range")));
>
>        shouldn't this be
>        secs = rint(secs);
>        result->time = 0;
>        pg_mul_add_s64_overflow(secs, USECS_PER_SEC, &result->time) to catch
>        overflow error early?

The problem is that `secs = rint(secs)` rounds the seconds too early
and loses any fractional seconds. Do we have an overflow detecting
multiplication function for floats?

>    +        if TIMESTAMP_IS_NOBEGIN
>    +            (dt2)
>
>    Better be written as if (TIMESTAMP_IS_NOBEGIN(dt2))? There are more corrections
>    like this.

I think this may have also been done by pg_indent, I've reverted all
the examples of this.

>    +    if (INTERVAL_NOT_FINITE(result))
>    +        ereport(ERROR,
>    +                (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
>    +                 errmsg("interval out of range")));
>
>    Probably, I added these kind of checks. But I don't remember if those are
>    defensive checks or whether it's really possible that the arithmetic above
>    these lines can yield an non-finite interval.

These checks appear in `make_interval`, `justify_X`,
`interval_um_internal`, `interval_pl`, `interval_mi`, `interval_mul`,
`interval_div`. For all of these it's possible that the interval
overflows/underflows the non-finite ranges, but does not
overflow/underflow the data type. For example
`SELECT INTERVAL '2147483646 months' + INTERVAL '1 month'` would error
on this check.


>    +    else
>    +    {
>    +        result->time = -interval->time;
>    +        result->day = -interval->day;
>    +        result->month = -interval->month;
>    +
>    +        if (INTERVAL_NOT_FINITE(result))
>    +            ereport(ERROR,
>    +                    (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
>    +                     errmsg("interval out of range")));
>
>    If this error ever gets to the user, it could be confusing. Can we elaborate by
>    adding context e.g. errcontext("while negating an interval") or some such?

Done.

>    -
>    -    result->time = -interval->time;
>    -    /* overflow check copied from int4um */
>    -    if (interval->time != 0 && SAMESIGN(result->time, interval->time))
>    -        ereport(ERROR,
>    -                (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
>    -                 errmsg("interval out of range")));
>    -    result->day = -interval->day;
>    -    if (interval->day != 0 && SAMESIGN(result->day, interval->day))
>    -        ereport(ERROR,
>    -                (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
>    -                 errmsg("interval out of range")));
>    -    result->month = -interval->month;
>    -    if (interval->month != 0 && SAMESIGN(result->month, interval->month))
>    -        ereport(ERROR,
>    -                (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
>    -                 errmsg("interval out of range")));
>    +    interval_um_internal(interval, result);
>
>    Shouldn't we incorporate these checks in interval_um_internal()? I don't think
>    INTERVAL_NOT_FINITE() covers all of those.

I replaced these checks with the following:

+ else if (interval->time == PG_INT64_MIN || interval->day == PG_INT32_MIN || interval->month == PG_INT32_MIN)
+ ereport(ERROR,
+ (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
+ errmsg("interval out of range")));

I think this covers the same overflow check but is maybe a bit more
obvious. Unless, there's something I'm missing?

>    +    /*
>    +     * Subtracting two infinite intervals with different signs results in an
>    +     * infinite interval with the same sign as the left operand. Subtracting
>    +     * two infinte intervals with the same sign results in an error.
>    +     */
>
>    I think we need someone to validate these assumptions and similar assumptions
>    in interval_pl(). Googling gives confusing results in some cases. I have not
>    looked for IEEE standard around this specificallly.

I used to Python and Rust to check my assumptions on the IEEE standard:

Python:
>>> float('inf') + float('inf')
inf
>>> float('-inf') + float('inf')
nan
>>> float('inf') + float('-inf')
nan
>>> float('-inf') + float('-inf')
-inf

>>> float('inf') - float('inf')
nan
>>> float('-inf') - float('inf')
-inf
>>> float('inf') - float('-inf')
inf
>>> float('-inf') - float('-inf')
nan

Rust:
inf + inf = inf
-inf + inf = NaN
inf + -inf = NaN
-inf + -inf = -inf

inf - inf = NaN
-inf - inf = -inf
inf - -inf = inf
-inf - -inf = NaN

I'll try and look up the actual standard and see what it says.

>    +    if (INTERVAL_NOT_FINITE(interval))
>    +    {
>    +        double        r = NonFiniteIntervalPart(type, val, lowunits,
>    +                                              INTERVAL_IS_NOBEGIN(interval),
>    +                                              false);
>    +
>    +        if (r)
>
>    I see that this code is very similar to the corresponding code in timestamp and
>    timestamptz, so it's bound to be correct. But I always thought float equality
>    is unreliable. if (r) is equivalent to if (r == 0.0) so it will not work as
>    intended. But may be (float) 0.0 is a special value for which equality holds
>    true.

I'm not familiar with float equality being unreliable, but I'm by no
means a C or float expert. Can you link me to some docs/explanation?

>    +static inline bool
>    +pg_mul_add_s64_overflow(int64 val, int64 multiplier, int64 *sum)
>
>    I think this needs a prologue similar to int64_multiply_add(), that the patch
>    removes. Similarly for pg_mul_add_s32_overflow().

I've added this to the first patch.

Thanks for the review! Sorry for the delayed response.

- Joe Koshakow
Attachment

pgsql-hackers by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode
Next
From: Tom Lane
Date:
Subject: Re: Infinite Interval