Thread: btree_gin: Incorrect leftmost interval value
In contrib/btree_gin, leftmostvalue_interval() does this: leftmostvalue_interval(void) { Interval *v = palloc(sizeof(Interval)); v->time = DT_NOBEGIN; v->day = 0; v->month = 0; return IntervalPGetDatum(v); } which is a long way short of the minimum possible interval value. As a result, a < or <= query using a GIN index on an interval column may miss values. For example: CREATE EXTENSION btree_gin; CREATE TABLE foo (a interval); INSERT INTO foo VALUES ('-1000000 years'); CREATE INDEX foo_idx ON foo USING gin (a); SET enable_seqscan = off; SELECT * FROM foo WHERE a < '1 year'; a --- (0 rows) Attached is a patch fixing this by setting all the fields to their minimum values, which is guaranteed to be less than any other interval. Note that this doesn't affect the contents of the index itself, so reindexing is not necessary. Regards, Dean
Attachment
On 27/10/2023 12:26, Dean Rasheed wrote: > In contrib/btree_gin, leftmostvalue_interval() does this: > > leftmostvalue_interval(void) > { > Interval *v = palloc(sizeof(Interval)); > > v->time = DT_NOBEGIN; > v->day = 0; > v->month = 0; > return IntervalPGetDatum(v); > } > > which is a long way short of the minimum possible interval value. Good catch! > Attached is a patch fixing this by setting all the fields to their > minimum values, which is guaranteed to be less than any other > interval. LGTM. I wish extractQuery could return "leftmost" more explicitly, so that we didn't need to construct these leftmost values. But I don't think that's supported by the current extractQuery interface. -- Heikki Linnakangas Neon (https://neon.tech)
On Fri, Oct 27, 2023 at 2:57 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > > In contrib/btree_gin, leftmostvalue_interval() does this: > > leftmostvalue_interval(void) > { > Interval *v = palloc(sizeof(Interval)); > > v->time = DT_NOBEGIN; > v->day = 0; > v->month = 0; > return IntervalPGetDatum(v); > } > > which is a long way short of the minimum possible interval value. > > As a result, a < or <= query using a GIN index on an interval column > may miss values. For example: > > CREATE EXTENSION btree_gin; > CREATE TABLE foo (a interval); > INSERT INTO foo VALUES ('-1000000 years'); > CREATE INDEX foo_idx ON foo USING gin (a); > > SET enable_seqscan = off; > SELECT * FROM foo WHERE a < '1 year'; > a > --- > (0 rows) > > Attached is a patch fixing this by setting all the fields to their > minimum values, which is guaranteed to be less than any other > interval. Should we change this to call INTERVAL_NOBEGIN() to be added by infinite interval patches? It's the same effect but looks similar to leftmostvalue_timestamp/float8 etc. It will need to wait for the infinite interval patches to commit but I guess, the wait won't be too long and the outcome will be better. I can include this change in those patches. -- Best Wishes, Ashutosh Bapat
On Fri, 27 Oct 2023 at 13:56, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > Should we change this to call INTERVAL_NOBEGIN() to be added by > infinite interval patches? > Given that this is a bug that can lead to incorrect query results, I plan to back-patch it, and INTERVAL_NOBEGIN() wouldn't make sense in back-branches that don't have infinite intervals. Regards, Dean
Dean Rasheed <dean.a.rasheed@gmail.com> writes: > On Fri, 27 Oct 2023 at 13:56, Ashutosh Bapat > <ashutosh.bapat.oss@gmail.com> wrote: >> Should we change this to call INTERVAL_NOBEGIN() to be added by >> infinite interval patches? > Given that this is a bug that can lead to incorrect query results, I > plan to back-patch it, and INTERVAL_NOBEGIN() wouldn't make sense in > back-branches that don't have infinite intervals. Agreed. When/if the infinite interval patch lands, it could update this function to use the macro. regards, tom lane