Re: Bug in spg_range_quad_inner_consistent for adjacent operator (was Re: Add a filed to PageHeaderData) - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Bug in spg_range_quad_inner_consistent for adjacent operator (was Re: Add a filed to PageHeaderData)
Date
Msg-id 53AB0263.5000807@vmware.com
Whole thread Raw
In response to Bug in spg_range_quad_inner_consistent for adjacent operator (was Re: Add a filed to PageHeaderData)  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: Bug in spg_range_quad_inner_consistent for adjacent operator (was Re: Add a filed to PageHeaderData)  (Pavan Deolasee <pavan.deolasee@gmail.com>)
List pgsql-hackers
On 06/24/2014 11:22 PM, Heikki Linnakangas wrote:
> The real bug is in spg_range_quad_inner_consistent(), for the adjacent
> operator. Things go wrong when:
>
> The scan key is [100, 500)
> The prev centroid is [500, 510)
> The current centroid is [544, 554).
>
> The row that should match but isn't returned, [500, 510) is equal to the
> previous centroid. It's in quadrant 3 from the current centroid, but
> spg_range_quad_inner_consistent() incorrectly concludes that it doesn't
> need to scan that quadrant.
>
> The function compares the scan key's upper bound with the the previous
> centroid's lower bound and the current centroid's lower bound:
>
>> /*
>>   * Check if upper bound of argument is not in a
>>   * quadrant we visited in the previous step.
>>   */
>> cmp1 = range_cmp_bounds(typcache, &upper, &prevLower);
>> cmp2 = range_cmp_bounds(typcache, ¢roidLower,
>>             &prevLower);
>> if ((cmp2 < 0 && cmp1 > 0) || (cmp2 > 0 && cmp1 < 0))
>>      which2 = 0;
>
> The idea is that if the scan key's upper bound doesn't fall between the
> prev and current centroid's lower bounds, there is no match.
>
>     *   *    *
>    PL   X    CL
>
> X = scan key's upper bound: 500)
> PL = prev centroid's lower bound [500
> CL = current centroid's lower bound [500
>
> This is wrong. X < PL, but it's still nevertheless adjacent to it.
>
> I'll take a closer look tomorrow...
>
> (The "if (which2) ..." block after the code I quoted above also looks
> wrong - it seems to be comparing the argument's lower bound when it
> should be comparing the upper bound according to the comment. )

I came up with the attached. There were several bugs:

* The "if (which2) { ... }"  block was broken. It compared the
argument's lower bound against centroid's upper bound, while it was
supposed to compare the argument's upper bound against the centroid's
lower bound (the comment was right). Also, it clear bits in the "which1"
variable, while it was supposed to clear bits in "which2". ISTM it was
copy-pasted from the if (which1) { ... }" block just before it, but not
modified.

* If the argument's upper bound was equal to the centroid's lower bound,
we descended to both halves (= all quadrants). That's unnecessary.
Imagine that the argument is (x, 500), and the centroid is (500, y), so
that the bounds are equal. The adjacent ranges that we're trying to find
would be of form [500, z), which are to the right of the centroid. There
is no need to visit the left quadrants. This won't lead to incorrect
query results, but slows down queries unnecessarily.

* In the case that argument's lower bound is adjacent to the centroid's
upper bound, we also don't need to visit all quadrants. Per similar
reasoning as previous point.

* The code where we compare the previous centroid with the current
centroid should match the code where we compare the current centroid
with the argument. The point of that code is to redo the calculation
done in the previous level, to see if we were supposed to traverse left
or right (or up or down), and if we actually did. If we moved in the
different direction, then we know there are no matches for bound.

Those could be fixed with quite small adjustments, but I think the code
needs some refactoring. It's complicated as it is, it's very difficult
to understand all the cases and comparisons. Case in point: the patch
was written by Alexander, reviewed by Jeff, and committed by me, and we
all missed the above bugs. So, I propose the attached.

I also wrote the attached little white-box testing tool for this. It
allows you to call spg_range_quad_inner_consistent with the adjacent
strategy, and pass the exact argument, centroid and prev centroid ranges
you want. It prints out the result of which quadrants to visit.

- Heikki


Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: pg_filedump for 9.4?
Next
From: Andres Freund
Date:
Subject: better atomics - v0.5