Re: BRIN minmax multi - incorrect distance for infinite timestamp/date - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: BRIN minmax multi - incorrect distance for infinite timestamp/date
Date
Msg-id CAExHW5vcq6mJJRH3wPytDataDw+HZgs-E7b=e+SXpWgo7BoTsg@mail.gmail.com
Whole thread Raw
In response to Re: BRIN minmax multi - incorrect distance for infinite timestamp/date  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: BRIN minmax multi - incorrect distance for infinite timestamp/date
List pgsql-hackers
On Mon, Oct 16, 2023 at 7:33 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 10/16/23 11:25, Ashutosh Bapat wrote:
> > Thanks Tomas for bringing this discussion to hackers.
> >
> >
> > On Fri, Oct 13, 2023 at 8:58 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> >>
> >> On Fri, 13 Oct 2023 at 13:17, Tomas Vondra
> >> <tomas.vondra@enterprisedb.com> wrote:
> >>>
> >>> I do plan to backpatch this, yes. I don't think there are many people
> >>> affected by this (few people are using infinite dates/timestamps, but
> >>> maybe the overflow could be more common).
> >>>
> >
> > The example you gave is missing CREATE INDEX command. Is it "create
> > index test_idx_a on test using brin(a);"
>
> Ah, you're right - apologies. FWIW when testing I usually use 1-page
> ranges, to possibly hit more combinations / problems. More importantly,
> it needs to specify the opclass, otherwise it'll use the default minmax
> opclass which does not use distance at all:
>
> create index test_idx_a on test
>  using brin (a timestamptz_minmax_multi_ops) with (pages_per_range=1);
>

Thanks.

> >
> > Do already create indexes have this issue? Do they need to rebuilt
> > after upgrading?
> >
>
> Yes, existing indexes will have inefficient ranges. I'm not sure we want
> to push people to reindex everything, the issue seem somewhat unlikely
> in practice.
>

If the column has infinity values only then they need to rebuild the
index. Such users may notice this bug fix in the release notes and
decide to rebuild the index themselves.

> >> I think this should be rewritten to compute delta from ia and ib
> >> without going via an intermediate Interval value. I.e., instead of
> >> computing "result", just do something like
> >>
> >>     dayfraction = (ib->time % USECS_PER_DAY) - (ia->time % USECS_PER_DAY);
> >>     days = (ib->time / USECS_PER_DAY) - (ia->time / USECS_PER_DAY);
> >>     days += (int64) ib->day - (int64) ia->day;
> >>     days += ((int64) ib->month - (int64) ia->month) * INT64CONST(30);
> >>
> >> then convert to double precision as it does now:
> >>
> >>     delta = (double) days + dayfraction / (double) USECS_PER_DAY;
> >>
> >
> > Given Tomas's explanation of how these functions are supposed to work,
> > I think your suggestions is better.
> >
> > I was worried that above calculations may not produce the same result
> > as the current code when there is no error because modulo and integer
> > division are not distributive over subtraction. But it looks like
> > together they behave as normal division which is distributive over
> > subtraction. I couldn't find an example where that is not true.
> >
> > Tomas, you may want to incorporate this in your patch. If not, I will
> > incorporate it in my infinite interval patchset in [1].
> >
>
> I'd rather keep it as separate patch, although maybe let's deal with it
> separately from the larger patches. It's a bug, and having it in a patch
> set that adds a feature does not seem like a good idea (or maybe I don't
> understand what the other thread does, I haven't looked very closely).
>

If you incorporate these changes, I will need to remove 0009, which
mostly rewrites that function, from my patchset. If you don't, my
patch rewrites anyway. Either way is fine with me.

--
Best Wishes,
Ashutosh Bapat



pgsql-hackers by date:

Previous
From: David Steele
Date:
Subject: Re: Rename backup_label to recovery_control
Next
From: Erik Rijkers
Date:
Subject: event trigger sgml touch-up