Re: BRIN minmax multi - incorrect distance for infinite timestamp/date - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: BRIN minmax multi - incorrect distance for infinite timestamp/date
Date
Msg-id dd7fcf77-414b-b4b2-358a-0fc505255fba@enterprisedb.com
Whole thread Raw
In response to Re: BRIN minmax multi - incorrect distance for infinite timestamp/date  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Responses Re: BRIN minmax multi - incorrect distance for infinite timestamp/date
List pgsql-hackers
On 10/16/23 11:25, Ashutosh Bapat wrote:
> Thanks Tomas for bringing this discussion to hackers.
> 
> 
> On Fri, Oct 13, 2023 at 8:58 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>>
>> On Fri, 13 Oct 2023 at 13:17, Tomas Vondra
>> <tomas.vondra@enterprisedb.com> wrote:
>>>
>>> I do plan to backpatch this, yes. I don't think there are many people
>>> affected by this (few people are using infinite dates/timestamps, but
>>> maybe the overflow could be more common).
>>>
> 
> The example you gave is missing CREATE INDEX command. Is it "create
> index test_idx_a on test using brin(a);"

Ah, you're right - apologies. FWIW when testing I usually use 1-page
ranges, to possibly hit more combinations / problems. More importantly,
it needs to specify the opclass, otherwise it'll use the default minmax
opclass which does not use distance at all:

create index test_idx_a on test
 using brin (a timestamptz_minmax_multi_ops) with (pages_per_range=1);

> 
> Do already create indexes have this issue? Do they need to rebuilt
> after upgrading?
> 

Yes, existing indexes will have inefficient ranges. I'm not sure we want
to push people to reindex everything, the issue seem somewhat unlikely
in practice.

>>
>> OK, though I doubt that such values are common in practice.
>>
>> There's also an overflow problem in
>> brin_minmax_multi_distance_interval() though, and I think that's worse
>> because overflows there throw "interval out of range" errors, which
>> can prevent index creation or inserts.
>>
>> There's a patch (0009 in [1]) as part of the infinite interval work,
>> but that just uses interval_mi(), and so doesn't fix the
>> interval-out-of-range errors, except for infinite intervals, which are
>> treated as special cases, which I don't think is really necessary.
>>
> 
> Right. I used interval_mi() to preserve the finite value behaviour as
> is. But ...
> 
>> I think this should be rewritten to compute delta from ia and ib
>> without going via an intermediate Interval value. I.e., instead of
>> computing "result", just do something like
>>
>>     dayfraction = (ib->time % USECS_PER_DAY) - (ia->time % USECS_PER_DAY);
>>     days = (ib->time / USECS_PER_DAY) - (ia->time / USECS_PER_DAY);
>>     days += (int64) ib->day - (int64) ia->day;
>>     days += ((int64) ib->month - (int64) ia->month) * INT64CONST(30);
>>
>> then convert to double precision as it does now:
>>
>>     delta = (double) days + dayfraction / (double) USECS_PER_DAY;
>>
> 
> Given Tomas's explanation of how these functions are supposed to work,
> I think your suggestions is better.
> 
> I was worried that above calculations may not produce the same result
> as the current code when there is no error because modulo and integer
> division are not distributive over subtraction. But it looks like
> together they behave as normal division which is distributive over
> subtraction. I couldn't find an example where that is not true.
> 
> Tomas, you may want to incorporate this in your patch. If not, I will
> incorporate it in my infinite interval patchset in [1].
> 

I'd rather keep it as separate patch, although maybe let's deal with it
separately from the larger patches. It's a bug, and having it in a patch
set that adds a feature does not seem like a good idea (or maybe I don't
understand what the other thread does, I haven't looked very closely).

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Add support for AT LOCAL
Next
From: Robert Haas
Date:
Subject: Re: Rename backup_label to recovery_control