Home > mailing lists

Re: Infinite Interval - Mailing list pgsql-hackers

From	Joseph Koshakow
Subject	Re: Infinite Interval
Date	March 18, 2023 21:48:33
Msg-id	CAAvxfHdNDLJZJoPcAo_52sLxEMhqhm8-Bod31qF36puUtJZVnQ@mail.gmail.com Whole thread Raw
In response to	Re: Infinite Interval (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Responses	Re: Infinite Interval (Tom Lane <tgl@sss.pgh.pa.us>) Re: Infinite Interval (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
List	pgsql-hackers

Tree view

On Thu, Mar 9, 2023 at 12:42 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
>
>
> static pg_tz *FetchDynamicTimeZone(TimeZoneAbbrevTable *tbl, const datetkn *tp,
> - DateTimeErrorExtra *extra);
> + DateTimeErrorExtra * extra);
>
> There are a lot of these diffs. PG code doesn't leave an extra space between
> variable name and *.

Those appeared from running pg_indent. I've removed them all.

> /* Handle the integer part */
> - if (!int64_multiply_add(val, scale, &itm_in->tm_usec))
> + if (pg_mul_add_s64_overflow(val, scale, &itm_in->tm_usec))
>
> I think this is a good change, since we are moving the function to int.h where
> it belongs. We could separate these kind of changes into another patch for easy
> review.

I've separated this out into another patch attached to this email.
Should I start a new email thread or is it ok to include it in this
one?

> +
> + result->day = days;
> + if (pg_mul_add_s32_overflow(weeks, 7, &result->day))
> + ereport(ERROR,
> + (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
> + errmsg("interval out of range")));
>
> I think such changes are also good, but probably a separate patch for ease of
> review.

I've separated a patch for this too, which I've also included in this
email.

> secs = rint(secs * USECS_PER_SEC);
> - result->time = hours mj* ((int64) SECS_PER_HOUR * USECS_PER_SEC) +
> - mins * ((int64) SECS_PER_MINUTE * USECS_PER_SEC) +
> - (int64) secs;
> +
> + result->time = secs;
> + if (pg_mul_add_s64_overflow(mins, ((int64) SECS_PER_MINUTE *
> USECS_PER_SEC), &result->time))
> + ereport(ERROR,
> + (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
> + errmsg("interval out of range")));
> + if (pg_mul_add_s64_overflow(hours, ((int64) SECS_PER_HOUR *
> USECS_PER_SEC), &result->time))
> + ereport(ERROR,
> + (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
> + errmsg("interval out of range")));
>
> shouldn't this be
> secs = rint(secs);
> result->time = 0;
> pg_mul_add_s64_overflow(secs, USECS_PER_SEC, &result->time) to catch
> overflow error early?

The problem is that `secs = rint(secs)` rounds the seconds too early
and loses any fractional seconds. Do we have an overflow detecting
multiplication function for floats?

> + if TIMESTAMP_IS_NOBEGIN
> + (dt2)
>
> Better be written as if (TIMESTAMP_IS_NOBEGIN(dt2))? There are more corrections
> like this.

I think this may have also been done by pg_indent, I've reverted all
the examples of this.

> + if (INTERVAL_NOT_FINITE(result))
> + ereport(ERROR,
> + (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
> + errmsg("interval out of range")));
>
> Probably, I added these kind of checks. But I don't remember if those are
> defensive checks or whether it's really possible that the arithmetic above
> these lines can yield an non-finite interval.

These checks appear in `make_interval`, `justify_X`,
`interval_um_internal`, `interval_pl`, `interval_mi`, `interval_mul`,
`interval_div`. For all of these it's possible that the interval
overflows/underflows the non-finite ranges, but does not
overflow/underflow the data type. For example
`SELECT INTERVAL '2147483646 months' + INTERVAL '1 month'` would error
on this check.

> + else
> + {
> + result->time = -interval->time;
> + result->day = -interval->day;
> + result->month = -interval->month;
> +
> + if (INTERVAL_NOT_FINITE(result))
> + ereport(ERROR,
> + (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
> + errmsg("interval out of range")));
>
> If this error ever gets to the user, it could be confusing. Can we elaborate by
> adding context e.g. errcontext("while negating an interval") or some such?

Done.

> -
> - result->time = -interval->time;
> - /* overflow check copied from int4um */
> - if (interval->time != 0 && SAMESIGN(result->time, interval->time))
> - ereport(ERROR,
> - (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
> - errmsg("interval out of range")));
> - result->day = -interval->day;
> - if (interval->day != 0 && SAMESIGN(result->day, interval->day))
> - ereport(ERROR,
> - (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
> - errmsg("interval out of range")));
> - result->month = -interval->month;
> - if (interval->month != 0 && SAMESIGN(result->month, interval->month))
> - ereport(ERROR,
> - (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
> - errmsg("interval out of range")));
> + interval_um_internal(interval, result);
>
> Shouldn't we incorporate these checks in interval_um_internal()? I don't think
> INTERVAL_NOT_FINITE() covers all of those.

I replaced these checks with the following:

+ else if (interval->time == PG_INT64_MIN || interval->day == PG_INT32_MIN || interval->month == PG_INT32_MIN)
+ ereport(ERROR,
+ (errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
+ errmsg("interval out of range")));

I think this covers the same overflow check but is maybe a bit more
obvious. Unless, there's something I'm missing?

> + /*
> + * Subtracting two infinite intervals with different signs results in an
> + * infinite interval with the same sign as the left operand. Subtracting
> + * two infinte intervals with the same sign results in an error.
> + */
>
> I think we need someone to validate these assumptions and similar assumptions
> in interval_pl(). Googling gives confusing results in some cases. I have not
> looked for IEEE standard around this specificallly.

I used to Python and Rust to check my assumptions on the IEEE standard:

Python:
>>> float('inf') + float('inf')
inf
>>> float('-inf') + float('inf')
nan
>>> float('inf') + float('-inf')
nan
>>> float('-inf') + float('-inf')
-inf

>>> float('inf') - float('inf')
nan
>>> float('-inf') - float('inf')
-inf
>>> float('inf') - float('-inf')
inf
>>> float('-inf') - float('-inf')
nan

Rust:
inf + inf = inf
-inf + inf = NaN
inf + -inf = NaN
-inf + -inf = -inf

inf - inf = NaN
-inf - inf = -inf
inf - -inf = inf
-inf - -inf = NaN

I'll try and look up the actual standard and see what it says.

> + if (INTERVAL_NOT_FINITE(interval))
> + {
> + double r = NonFiniteIntervalPart(type, val, lowunits,
> + INTERVAL_IS_NOBEGIN(interval),
> + false);
> +
> + if (r)
>
> I see that this code is very similar to the corresponding code in timestamp and
> timestamptz, so it's bound to be correct. But I always thought float equality
> is unreliable. if (r) is equivalent to if (r == 0.0) so it will not work as
> intended. But may be (float) 0.0 is a special value for which equality holds
> true.

I'm not familiar with float equality being unreliable, but I'm by no
means a C or float expert. Can you link me to some docs/explanation?

> +static inline bool
> +pg_mul_add_s64_overflow(int64 val, int64 multiplier, int64 *sum)
>
> I think this needs a prologue similar to int64_multiply_add(), that the patch
> removes. Similarly for pg_mul_add_s32_overflow().

I've added this to the first patch.

Thanks for the review! Sorry for the delayed response.

- Joe Koshakow

Attachment

pgsql-hackers by date:

From: Justin Pryzby
Date: 18 March 2023, 21:30:38
Subject: Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From: Tom Lane
Date: 18 March 2023, 22:08:33
Subject: Re: Infinite Interval

Re: Infinite Interval - Mailing list pgsql-hackers

Attachment

Previous

Next