Re: Infinite Interval - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Infinite Interval
Date
Msg-id CAExHW5vQ1zY03794eyFrMGapLOKHcHydLaE8jG7WDtq7qnba-A@mail.gmail.com
Whole thread Raw
In response to Re: Infinite Interval  (Dean Rasheed <dean.a.rasheed@gmail.com>)
List pgsql-hackers
On Fri, Oct 27, 2023 at 2:07 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> On Wed, 4 Oct 2023 at 14:29, Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > I think we should introduce interval_out_of_range_error() function on
> > the lines of float_overflow_error(). Later patches introduce more
> > places where we raise that error. We can introduce the function as
> > part of those patches.
> >
>
> I'm not convinced that it is really worth it. Note also that even with
> this patch, there are still more places that throw "timestamp out of
> range" errors than "interval out of range" errors.

Fine with me.

>
> > > 4. I tested pg_upgrade on a table with an interval with INT_MAX
> > > months, and it was silently converted to infinity. I think that's
> > > probably the best outcome (better than failing). However, this means
> > > that we really should require all 3 fields of an interval to be
> > > INT_MIN/MAX for it to be considered infinite, otherwise it would be
> > > possible to have multiple internal representations of infinity that do
> > > not compare as equal.
> > >
> > My first patch was comparing all the three fields to determine whether
> > a given Interval value represents infinity. [3] changed that to use
> > only the month field. I guess that was based on the discussion at [4].
> > You may want to review that discussion if not already done. I am fine
> > either way. We should be able to change the comparison code later if
> > we see performance getting impacted.
> >
>
> Before looking at the details more closely, I might have agreed with
> that earlier discussion. However, given that things like pg_upgrade
> have the possibility of turning formerly allowed, finite intervals
> into infinity, we really need to ensure that there is only one value
> equal to infinity, otherwise the results are likely to be very
> confusing and counter-intuitive. That means that we have to continue
> to regard intervals like INT32_MAX months + 10 days as finite.
>
> While I haven't done any performance testing, I wouldn't expect this
> to have much impact. In a 64-bit build, this actually generates 2
> comparisons rather than 3 -- one comparing the combined month and day
> fields against a 64-bit value containing 2 copies of INT32_MAX, and
> one testing the time field. In practice, only the first test will be
> executed in the vast majority of cases.
>

Thanks for the analysis.

>
> Something that perhaps does need discussing is the fact that
> '2147483647 months 2147483647 days 9223372036854775807 usecs' is now
> accepted by interval_in() and gives infinity. That's a bit ugly, but I
> think it's defensible as a measure to prevent dump/restore errors from
> older databases, and in any case, such an interval is outside the
> documented range of supported intervals, and is a highly contrived
> example, vanishingly improbable in practice.

Agreed.

>
> Alternatively, we could have interval_in() reject this, which would
> open up the possibility of dump/restore errors. It could be argued
> that that's OK, for similar reasons -- the failing value is highly
> unlikely/contrived, and out of the documented range. I don't like that
> though. I don't think dump/restore should fail under any
> circumstances, however unlikely.

I agree that dump/restore shouldn't fail, especially when restore on
one major version succeeds and fails on another.

>
> Another alternative is to accept this input, but emit a WARNING. I
> don't particularly like that either, since it's forcing a check on
> every input value, just to cater for this one specific highly unlikely
> input. In fact, both these alternative approaches (rejecting the
> value, or emitting a warning), would impose a small performance
> penalty on every interval input, which I don't think is really worth
> it.

Agreed.

>
> So overall, my preference is to just accept it. Anything else is more
> work, for no practical benefit.
>

Ok.

--
Best Wishes,
Ashutosh Bapat



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: pg_upgrade and logical replication
Next
From: Andrew Dunstan
Date:
Subject: Re: run pgindent on a regular basis / scripted manner