On Tue, Jul 1, 2025 at 11:21 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> writes:
> > I tried the patch and it compiles and works as expected. Several minor
> > things I noticed:
>
> Thanks for looking at it!
>
> > 1) btree_gin.c
>
> > case BTEqualStrategyNumber:
> > if (cmp > 0)
> > res = -1; /* keep scanning */
> > else if (cmp == 0)
> > res = 0;
> > else
> > res = 1; /* end scan */
> > break;
>
> > I think the code is correct, but do we need to continue scanning here?
> > I can't think of an example where we have cmp == 0 after cmp > 0.
> > Maybe we can use cmp != 0 as a stopping condition?
>
> No, I think you're reading the code backward. cmp > 0 means the
> current index entry is less than the query's search value, so we
> need to keep scanning forward to see if there's any entries that
> match. We can stop when we get to larger entries, with cmp < 0.
>
Sorry, I think I wasn't clear enough. I agree with this logic, but I
think it implies an impossible scenario for the "equals" case. The
scenario where during a scan we first have keys that are less than
orig_datum, and then a key that is equal to orig_datum. Why I think
such a scenario is impossible: GIN uses partial_key as a lower bound
when positioning the start of a partial match scan. So if there is any
key in the index that is equal to "partial key", it must be the very
first key in the scan. Then if the very first key in the scan is less
than orig_datum, that means partial_key was also less than orig_datum
(because partial_key is a lower bound). And the only reason
partial_key might not be equal to orig_datum is that there is no value
equal to orig_datum in the index type. So we can say that if the very
first key in the scan is less than orig_datum, then there is no key in
the index that could be equal to orig_datum, and we can stop right
there. In fact, we can catch this right after converting from
orig_datum to entry_datum, but it would make the code more complex. I
think it's similar to what you wrote in the comment about out of range
values.
> I found this sign convention confusing too, and considered reversing
> the comparison call so that the "cmp" comparisons in this function
> would all flip. But I felt that such a change maybe doesn't belong
> in this patch. Also I wasn't sure other people would agree that
> it'd be an improvement --- the original code author, for one,
> presumably finds this natural.
>
Comments helped me a lot with all these cmp branches.
> >> * Query value falls between possible values of the index type
> >> (possible in float8->float4 or timestamp->date cases, for example).
> >> We can just use our usual conversion rules, though. The critical
> >> observation here is that it does not matter whether the conversion
> >> rounds to the next lower or next higher possible value.
>
> > I agree with the statements. It's quite a tricky part as for me,
> > probably it's better to add tests for this special case as it's done
> > for 'out of range' cases. FWIW while testing the patch I wrote some
> > tests for these rounding cases, I'm ready to add it to the patchset.
>
> Makes sense. Do you want to prepare the next patch version, then?
>
> regards, tom lane
Yeah, here is a new version with tests for rounding cases.
Best regards,
Arseniy Mukhin