Thread: btree_gin: Incorrect leftmost interval value

btree_gin: Incorrect leftmost interval value

From
Dean Rasheed
Date:
In contrib/btree_gin, leftmostvalue_interval() does this:

leftmostvalue_interval(void)
{
    Interval   *v = palloc(sizeof(Interval));

    v->time = DT_NOBEGIN;
    v->day = 0;
    v->month = 0;
    return IntervalPGetDatum(v);
}

which is a long way short of the minimum possible interval value.

As a result, a < or <= query using a GIN index on an interval column
may miss values. For example:

CREATE EXTENSION btree_gin;
CREATE TABLE foo (a interval);
INSERT INTO foo VALUES ('-1000000 years');
CREATE INDEX foo_idx ON foo USING gin (a);

SET enable_seqscan = off;
SELECT * FROM foo WHERE a < '1 year';
 a
---
(0 rows)

Attached is a patch fixing this by setting all the fields to their
minimum values, which is guaranteed to be less than any other
interval.

Note that this doesn't affect the contents of the index itself, so
reindexing is not necessary.

Regards,
Dean

Attachment

Re: btree_gin: Incorrect leftmost interval value

From
Heikki Linnakangas
Date:
On 27/10/2023 12:26, Dean Rasheed wrote:
> In contrib/btree_gin, leftmostvalue_interval() does this:
> 
> leftmostvalue_interval(void)
> {
>      Interval   *v = palloc(sizeof(Interval));
> 
>      v->time = DT_NOBEGIN;
>      v->day = 0;
>      v->month = 0;
>      return IntervalPGetDatum(v);
> }
> 
> which is a long way short of the minimum possible interval value.

Good catch!

> Attached is a patch fixing this by setting all the fields to their
> minimum values, which is guaranteed to be less than any other
> interval.

LGTM. I wish extractQuery could return "leftmost" more explicitly, so 
that we didn't need to construct these leftmost values. But I don't 
think that's supported by the current extractQuery interface.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




Re: btree_gin: Incorrect leftmost interval value

From
Ashutosh Bapat
Date:
On Fri, Oct 27, 2023 at 2:57 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> In contrib/btree_gin, leftmostvalue_interval() does this:
>
> leftmostvalue_interval(void)
> {
>     Interval   *v = palloc(sizeof(Interval));
>
>     v->time = DT_NOBEGIN;
>     v->day = 0;
>     v->month = 0;
>     return IntervalPGetDatum(v);
> }
>
> which is a long way short of the minimum possible interval value.
>
> As a result, a < or <= query using a GIN index on an interval column
> may miss values. For example:
>
> CREATE EXTENSION btree_gin;
> CREATE TABLE foo (a interval);
> INSERT INTO foo VALUES ('-1000000 years');
> CREATE INDEX foo_idx ON foo USING gin (a);
>
> SET enable_seqscan = off;
> SELECT * FROM foo WHERE a < '1 year';
>  a
> ---
> (0 rows)
>
> Attached is a patch fixing this by setting all the fields to their
> minimum values, which is guaranteed to be less than any other
> interval.

Should we change this to call INTERVAL_NOBEGIN() to be added by
infinite interval patches? It's the same effect but looks similar to
leftmostvalue_timestamp/float8 etc. It will need to wait for the
infinite interval patches to commit but I guess, the wait won't be too
long and the outcome will be better. I can include this change in
those patches.

--
Best Wishes,
Ashutosh Bapat



Re: btree_gin: Incorrect leftmost interval value

From
Dean Rasheed
Date:
On Fri, 27 Oct 2023 at 13:56, Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> Should we change this to call INTERVAL_NOBEGIN() to be added by
> infinite interval patches?
>

Given that this is a bug that can lead to incorrect query results, I
plan to back-patch it, and INTERVAL_NOBEGIN() wouldn't make sense in
back-branches that don't have infinite intervals.

Regards,
Dean



Re: btree_gin: Incorrect leftmost interval value

From
Tom Lane
Date:
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> On Fri, 27 Oct 2023 at 13:56, Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
>> Should we change this to call INTERVAL_NOBEGIN() to be added by
>> infinite interval patches?

> Given that this is a bug that can lead to incorrect query results, I
> plan to back-patch it, and INTERVAL_NOBEGIN() wouldn't make sense in
> back-branches that don't have infinite intervals.

Agreed.  When/if the infinite interval patch lands, it could update
this function to use the macro.

            regards, tom lane