Re: BRIN minmax multi - incorrect distance for infinite timestamp/date - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: BRIN minmax multi - incorrect distance for infinite timestamp/date |
Date | |
Msg-id | dd7fcf77-414b-b4b2-358a-0fc505255fba@enterprisedb.com Whole thread Raw |
In response to | Re: BRIN minmax multi - incorrect distance for infinite timestamp/date (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>) |
Responses |
Re: BRIN minmax multi - incorrect distance for infinite timestamp/date
|
List | pgsql-hackers |
On 10/16/23 11:25, Ashutosh Bapat wrote: > Thanks Tomas for bringing this discussion to hackers. > > > On Fri, Oct 13, 2023 at 8:58 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote: >> >> On Fri, 13 Oct 2023 at 13:17, Tomas Vondra >> <tomas.vondra@enterprisedb.com> wrote: >>> >>> I do plan to backpatch this, yes. I don't think there are many people >>> affected by this (few people are using infinite dates/timestamps, but >>> maybe the overflow could be more common). >>> > > The example you gave is missing CREATE INDEX command. Is it "create > index test_idx_a on test using brin(a);" Ah, you're right - apologies. FWIW when testing I usually use 1-page ranges, to possibly hit more combinations / problems. More importantly, it needs to specify the opclass, otherwise it'll use the default minmax opclass which does not use distance at all: create index test_idx_a on test using brin (a timestamptz_minmax_multi_ops) with (pages_per_range=1); > > Do already create indexes have this issue? Do they need to rebuilt > after upgrading? > Yes, existing indexes will have inefficient ranges. I'm not sure we want to push people to reindex everything, the issue seem somewhat unlikely in practice. >> >> OK, though I doubt that such values are common in practice. >> >> There's also an overflow problem in >> brin_minmax_multi_distance_interval() though, and I think that's worse >> because overflows there throw "interval out of range" errors, which >> can prevent index creation or inserts. >> >> There's a patch (0009 in [1]) as part of the infinite interval work, >> but that just uses interval_mi(), and so doesn't fix the >> interval-out-of-range errors, except for infinite intervals, which are >> treated as special cases, which I don't think is really necessary. >> > > Right. I used interval_mi() to preserve the finite value behaviour as > is. But ... > >> I think this should be rewritten to compute delta from ia and ib >> without going via an intermediate Interval value. I.e., instead of >> computing "result", just do something like >> >> dayfraction = (ib->time % USECS_PER_DAY) - (ia->time % USECS_PER_DAY); >> days = (ib->time / USECS_PER_DAY) - (ia->time / USECS_PER_DAY); >> days += (int64) ib->day - (int64) ia->day; >> days += ((int64) ib->month - (int64) ia->month) * INT64CONST(30); >> >> then convert to double precision as it does now: >> >> delta = (double) days + dayfraction / (double) USECS_PER_DAY; >> > > Given Tomas's explanation of how these functions are supposed to work, > I think your suggestions is better. > > I was worried that above calculations may not produce the same result > as the current code when there is no error because modulo and integer > division are not distributive over subtraction. But it looks like > together they behave as normal division which is distributive over > subtraction. I couldn't find an example where that is not true. > > Tomas, you may want to incorporate this in your patch. If not, I will > incorporate it in my infinite interval patchset in [1]. > I'd rather keep it as separate patch, although maybe let's deal with it separately from the larger patches. It's a bug, and having it in a patch set that adds a feature does not seem like a good idea (or maybe I don't understand what the other thread does, I haven't looked very closely). regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: