Thread: [HACKERS] [PATCH]: fix bug in SP-GiST box_ops
Working on the tests for new SP-GiST opclasses for polygons and circles, I've found a bug in the SP-GiST box_ops (added in 9.6): some operators (&<, &>, $<|, |&>) have wrong tests in spg_box_quad_inner_consistent(). This obviously leads to incorrect results of a SP-GiST index scan (see tests in the attached patch, their results were taken from a sequential scan). -- Nikita Glukhov Postgres Professional:http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Hello, At Mon, 30 Jan 2017 07:12:03 +0300, Nikita Glukhov <n.gluhov@postgrespro.ru> wrote in <9ea5b157-478c-8874-bc9b-050076b7d042@postgrespro.ru> > Working on the tests for new SP-GiST opclasses for polygons and > circles, I've > found a bug in the SP-GiST box_ops (added in 9.6): some operators > (&<, &>, $<|, |&>) have wrong tests in > spg_box_quad_inner_consistent(). > This obviously leads to incorrect results of a SP-GiST index scan (see > tests > in the attached patch, their results were taken from a sequential > scan). Your problem is not necessarily evident for others. Perhaps you have to provide an explanation and/or a test case that describes what is wrong for you so that others can get a clue for this problem. Simpler test is better. The test, | +INSERT INTO quad_box_tbl | + SELECT i, box(point(x, y), point(x + w, y + h)) | + FROM (SELECT i, | + random() * 1000 as x, random() * 1000 as y, | + random() * 20 as w, random() * 20 as h is inserting quad_boxes generated using random numbers then, | +SELECT count(*) FROM quad_box_tbl WHERE b << box '((100,200),(300,500))'; | + count | +------- | + 891 counting them in this way is unstable. Even though this were stable due to a fixed initial, this would be unacceptable, I think. This kind of test should use nonrandom numbers. Even though I don't understand this in depth, the following change seems somewhat wrong in direction. Changing the second argument type seems breaking the basis of the design. | -lower2D(RangeBox *range_box, Range *query) | +lower2D(kiRangeBox *range_box, double query) reagrds, -- Kyotaro Horiguchi NTT Open Source Software Center
On 30.01.2017 12:04, Kyotaro HORIGUCHI wrote: > Hello, > > At Mon, 30 Jan 2017 07:12:03 +0300, Nikita Glukhov <n.gluhov@postgrespro.ru> wrote > in <9ea5b157-478c-8874-bc9b-050076b7d042@postgrespro.ru>: >> Working on the tests for new SP-GiST opclasses for polygons and >> circles, I've found a bug in the SP-GiST box_ops (added in 9.6): some operators >> (&<, &>, $<|, |&>) have wrong tests in spg_box_quad_inner_consistent(). >> This obviously leads to incorrect results of a SP-GiST index scan (see tests >> in the attached patch, their results were taken from a sequential scan). > > Your problem is not necessarily evident for others. Perhaps you > have to provide an explanation and/or a test case that describes > what is wrong for you so that others can get a clue for this > problem. Simpler test is better. The problem is that there are mixed low/high values for x/y axes. For example, overLeft4D() compares x-RangeBox rect_box->range_box_x with y-Range &query->right, and also lower2D() here must use query->high instead of query->low. The corresponding test is very simple: insert 10000 nearly arbitrary boxes and see the difference between results of sequential scan and results of index scan. regression=# CREATE TEMPORARY TABLE quad_box_tbl (b box); regression=# INSERT INTO quad_box_tbl SELECT box(point(x * 10, y * 10), point(x * 10 + 5, y * 10 + 5)) FROM generate_series(1, 100) x, generate_series(1, 100) y; regression=# CREATE INDEX quad_box_tbl_idx ON quad_box_tbl USING spgist(b); regression=# SET enable_seqscan = ON; regression=# SET enable_indexscan = OFF; regression=# SET enable_bitmapscan = OFF; regression=# SELECT count(*) FROM quad_box_tbl WHERE b &< box '((100,200),(300,500))'; count ------- 2900 (1 row) regression=# SELECT count(*) FROM quad_box_tbl WHERE b &> box '((100,200),(300,500))'; count ------- 9100 (1 row) regression=# SELECT count(*) FROM quad_box_tbl WHERE b &<| box '((100,200),(300,500))'; count ------- 4900 (1 row) regression=# SELECT count(*) FROM quad_box_tbl WHERE b |&> box '((100,200),(300,500))'; count ------- 8100 (1 row) regression=# SET enable_seqscan = OFF; regression=# SET enable_indexscan = ON; regression=# SET enable_bitmapscan = ON; regression=# SELECT count(*) FROM quad_box_tbl WHERE b &< box '((100,200),(300,500))'; count ------- 2430 (1 row) regression=# SELECT count(*) FROM quad_box_tbl WHERE b &> box '((100,200),(300,500))'; count ------- 6208 (1 row) regression=# SELECT count(*) FROM quad_box_tbl WHERE b &<| box '((100,200),(300,500))'; count ------- 1048 (1 row) regression=# SELECT count(*) FROM quad_box_tbl WHERE b |&> box '((100,200),(300,500))'; count ------- 5084 (1 row) > The test, > > | +INSERT INTO quad_box_tbl > | + SELECT i, box(point(x, y), point(x + w, y + h)) > | + FROM (SELECT i, > | + random() * 1000 as x, random() * 1000 as y, > | + random() * 20 as w, random() * 20 as h > > is inserting quad_boxes generated using random numbers then, > > | +SELECT count(*) FROM quad_box_tbl WHERE b << box '((100,200),(300,500))'; > | + count > | +------- > | + 891 > > counting them in this way is unstable. Even though this were > stable due to a fixed initial, this would be unacceptable, I > think. This kind of test should use nonrandom numbers. I have replaced random data in this test with stable uniformly distributed data. I would also like to use existing box data from rect.data that is loaded to slow_emp4000 table in copy.sql test, but box.sql test is executed before copy.sql. > Even though I don't understand this in depth, the following > change seems somewhat wrong in direction. Changing the second > argument type seems breaking the basis of the design. > > | -lower2D(RangeBox *range_box, Range *query) > | +lower2D(RangeBox *range_box, double query) Maybe. I also thought that these changes are quite doubtful, so I decided to replace lowerEqual2D()/higherEqual2D() with overlower2D()/overhigher2D() and not to touch lower2D()/higher2D(). -- Nikita Glukhov Postgres Professional:http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Hello, thank you for the revised patch. The only comment from me is about comments on the new over*2D funnctions. At Mon, 30 Jan 2017 21:12:31 +0300, Nikita Glukhov <n.gluhov@postgrespro.ru> wrote in <4450e7a6-01e7-0fb2-a01e-98fb5405d217@postgrespro.ru> > On 30.01.2017 12:04, Kyotaro HORIGUCHI wrote: > > Hello, > > > > At Mon, 30 Jan 2017 07:12:03 +0300, Nikita Glukhov > > <n.gluhov@postgrespro.ru> wrote > > in <9ea5b157-478c-8874-bc9b-050076b7d042@postgrespro.ru>: > >> Working on the tests for new SP-GiST opclasses for polygons and > >> circles, I've found a bug in the SP-GiST box_ops (added in 9.6): some > >> operators > >> (&<, &>, $<|, |&>) have wrong tests in > >> spg_box_quad_inner_consistent(). > >> This obviously leads to incorrect results of a SP-GiST index scan (see > >> tests > >> in the attached patch, their results were taken from a sequential > >> scan). > > > > Your problem is not necessarily evident for others. Perhaps you > > have to provide an explanation and/or a test case that describes > > what is wrong for you so that others can get a clue for this > > problem. Simpler test is better. > > The problem is that there are mixed low/high values for x/y axes. For > example, > overLeft4D() compares x-RangeBox rect_box->range_box_x with y-Range > &query->right, > and also lower2D() here must use query->high instead of query->low. > > The corresponding test is very simple: insert 10000 nearly arbitrary > boxes and > see the difference between results of sequential scan and results of > index scan. > > regression=# CREATE TEMPORARY TABLE quad_box_tbl (b box); > > regression=# INSERT INTO quad_box_tbl > SELECT box(point(x * 10, y * 10), point(x * 10 + 5, y * 10 + 5)) > FROM generate_series(1, 100) x, > generate_series(1, 100) y; > > regression=# CREATE INDEX quad_box_tbl_idx ON quad_box_tbl USING > spgist(b); Thank you for the explanation and the test with fixed numbers. I clearly understand what is the problem. > regression=# SET enable_seqscan = ON; > regression=# SET enable_indexscan = OFF; > regression=# SET enable_bitmapscan = OFF; > > regression=# SELECT count(*) FROM quad_box_tbl WHERE b &< box > '((100,200),(300,500))'; > count > ------- > 2900 > (1 row) > > regression=# SELECT count(*) FROM quad_box_tbl WHERE b &> box > '((100,200),(300,500))'; > count > ------- > 9100 > (1 row) > > regression=# SELECT count(*) FROM quad_box_tbl WHERE b &<| box > '((100,200),(300,500))'; > count > ------- > 4900 > (1 row) > > regression=# SELECT count(*) FROM quad_box_tbl WHERE b |&> box > '((100,200),(300,500))'; > count > ------- > 8100 > (1 row) > > regression=# SET enable_seqscan = OFF; > regression=# SET enable_indexscan = ON; > regression=# SET enable_bitmapscan = ON; (different result for the same queies) The minimal bad example for the '&<' case is the following. =# create table t (b box); =# create index on t using spgist(b); =# insert into t values ('((215, 305),(210,300))'::box); =# select * from t where b &< '((100,200),(300,400))'::box; b ---------------------(215,305),(210,300) (1 row) =# set enable_seqscan to false; =# select * from t where b &< '((100,200),(300,400))'::box;b --- (0 rows) Index scan excludes the row and it obviously is not a difference comes from error nor tolerance. The current overLeft4D is equivalent to the following. | FPlt(rect_box->range_box_x->left.high, query->right.high) && | FPlt(rect_box->range_box_x->right.high, query->right.high) This is translated into the BOX context as the following. | FPlt(range high(low.x), query->high.y) && | FPlt(range high(high.x), query->high.y) Yes, it is obviously wrong as your report. The code part of the v02 patch looks very good than the previous one. It is exactly what I had in my mind. Thanks. The following comment, > /* Can any range from range_box to be overlower than this argument? */ This might be better to be using the same wording to its users, for example the comment for overLeft4D is the following. | /* Can any rectangle from rect_box does not extend the right of this argument? */ And the documentation uses the following sentsnce, https://www.postgresql.org/docs/current/static/functions-geometry.html | Does not extend to the right of? So I think "Can any range from range_box does not extend the upper of this argument?" is better than the overlower. What do you think? > > counting them in this way is unstable. Even though this were > > stable due to a fixed initial, this would be unacceptable, I > > think. This kind of test should use nonrandom numbers. > > I have replaced random data in this test with stable uniformly > distributed data. > I would also like to use existing box data from rect.data that is > loaded to > slow_emp4000 table in copy.sql test, but box.sql test is executed > before copy.sql. Nonrandom data is good. I'd like to compare the result of corresnponding queries but unfortunately we don't have outer-join on boxes. Anyway the test is enough since it detects this bug. I confirmed other functions in geo_spgist but found no bugs like this. > > Even though I don't understand this in depth, the following > > change seems somewhat wrong in direction. Changing the second > > argument type seems breaking the basis of the design. > > > > | -lower2D(RangeBox *range_box, Range *query) > > | +lower2D(RangeBox *range_box, double query) > > Maybe. I also thought that these changes are quite doubtful, so I > decided to > replace lowerEqual2D()/higherEqual2D() with > overlower2D()/overhigher2D() and > not to touch lower2D()/higher2D(). Thanks, this looks bery good. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
On 31.01.2017 13:04, Kyotaro HORIGUCHI wrote: > The following comment, > >> /* Can any range from range_box to be overlower than this argument? */ > > This might be better to be using the same wording to its users, > for example the comment for overLeft4D is the following. > > | /* Can any rectangle from rect_box does not extend the right of this argument? */ > > And the documentation uses the following sentence, > > https://www.postgresql.org/docs/current/static/functions-geometry.html > | Does not extend to the right of? > > So I think "Can any range from range_box does not extend the > upper of this argument?" is better than the overlower. What do > you think? I think "does not extend the upper" is better than unclear "overlower" too. So I've corrected this comment in the attached v03 patch. > I confirmed other functions in geo_spgist but found no bugs like this. I found no bugs in other functions too. > The minimal bad example for the '&<' case is the following. > > =# create table t (b box); > =# create index on t using spgist(b); > =# insert into t values ('((215, 305),(210,300))'::box); > =# select * from t where b &< '((100,200),(300,400))'::box; > b > --------------------- > (215,305),(210,300) > (1 row) > > =# set enable_seqscan to false; > =# select * from t where b &< '((100,200),(300,400))'::box; > b > --- > (0 rows) I don't know how you were able to reproduce this bug in spg_box_quad_inner_consistent() using only one box because index tree must have at least one inner node (or 157 boxes) in order to spg_box_quad_inner_consistent() began to be called. That's why existing tests for box_ops didn't detect this bug: box_temp table has only 56 rows. -- Nikita Glukhov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
At Tue, 31 Jan 2017 14:38:39 +0300, Nikita Glukhov <n.gluhov@postgrespro.ru> wrote in <1622dc9f-cecf-cee3-b71e-b2bf34649053@postgrespro.ru> > On 31.01.2017 13:04, Kyotaro HORIGUCHI wrote: > > The following comment, > > > >> /* Can any range from range_box to be overlower than this argument? */ > > > > This might be better to be using the same wording to its users, > > for example the comment for overLeft4D is the following. > > > > | /* Can any rectangle from rect_box does not extend the right of this > > | argument? */ > > > > And the documentation uses the following sentence, > > > > https://www.postgresql.org/docs/current/static/functions-geometry.html > > | Does not extend to the right of? > > > > So I think "Can any range from range_box does not extend the > > upper of this argument?" is better than the overlower. What do > > you think? > > I think "does not extend the upper" is better than unclear "overlower" > too. > So I've corrected this comment in the attached v03 patch. Thank you. I think this patch is already in a good shape. Let's wait for a while for some commiter to pick up this. If you're afraid that this might be forgotten, it might be better to register this as an entry in the next commit fest. > > I confirmed other functions in geo_spgist but found no bugs like this. > > I found no bugs in other functions too. > > > > The minimal bad example for the '&<' case is the following. > > > > =# create table t (b box); > > =# create index on t using spgist(b); > > =# insert into t values ('((215, 305),(210,300))'::box); > > =# select * from t where b &< '((100,200),(300,400))'::box; > > b > > --------------------- > > (215,305),(210,300) > > (1 row) > > > > =# set enable_seqscan to false; > > =# select * from t where b &< '((100,200),(300,400))'::box; > > b > > --- > > (0 rows) > > I don't know how you were able to reproduce this bug in > spg_box_quad_inner_consistent() > using only one box because index tree must have at least one inner > node (or 157 boxes) > in order to spg_box_quad_inner_consistent() began to be called. > That's why existing > tests for box_ops didn't detect this bug: box_temp table has only 56 > rows. GiST seems to split the space even for the first tuple. I saw spg_box_quad_inner_consistent called several times on SELECT and fortunately the scan for the tuple going into wrong node. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
> I think this patch is already in a good shape. I am sorry for introducing this bug. This fix looks good to me as well.
On Wed, Feb 1, 2017 at 2:43 PM, Emre Hasegeli <emre@hasegeli.com> wrote:
I checked this patch too. And it seems good to me as well.
> I think this patch is already in a good shape.
I am sorry for introducing this bug. This fix looks good to me as well.
Should we mark it as "ready for committer"?
The Russian Postgres Company
Hello, On Thu, March 9, 2017 9:04 am, Alexander Korotkov wrote: > On Wed, Feb 1, 2017 at 2:43 PM, Emre Hasegeli <emre@hasegeli.com> wrote: > >> > I think this patch is already in a good shape. >> >> I am sorry for introducing this bug. This fix looks good to me as well. > > > I checked this patch too. And it seems good to me as well. > Should we mark it as "ready for committer"? I can't comment on the code, but the grammar on the comments caught my eye: > +/* Can any range from range_box does not extend higher than this argument? */ > >+static bool >+overLower2D(RangeBox *range_box, Range *query) >+{ >+ return FPle(range_box->left.low, query->high) && >+ FPle(range_box->right.low, query->high); >+} The sentence sounds quite garbled in English. I'm not entirely sure what it should be, but given the comment below "/* Can any range from range_box to be higher than this argument? */" maybe something like: /* Does any range from range_box extend to the right side of the query? */ If used, an analog wording should be used for overHigher2D's comment like: /* Does any range from range_box extend to the left side of the query? */ Also: /* Can any range from range_box to be higher than this argument? */ should be: /* Can any range from range_box be higher than this argument? */ Another question: Does it make sense to add the "minimal bad example for the '&<' case" as test case, too? After all, it should pass the test after the patch. Bets regards, Tels
On 10.03.2017 02:13, Tels wrote: > I can't comment on the code, but the grammar on the comments caught my eye: >> +/* Can any range from range_box does not extend higher than this argument? */ >> +static bool >> +overLower2D(RangeBox *range_box, Range *query) >> +{ >> + return FPle(range_box->left.low, query->high) && >> + FPle(range_box->right.low, query->high); >> +} > The sentence sounds quite garbled in English. I'm not entirely sure what > it should be, but given the comment below "/* Can any range from range_box > to be higher than this argument? */" maybe something like: > > /* Does any range from range_box extend to the right side of the query? */ > > If used, an analog wording should be used for overHigher2D's comment like: > > /* Does any range from range_box extend to the left side of the query? */ I've changed comments as you proposed, but I've added missing "not" and left "Can": /* Can any range from range_box not extend to the right side of the query? */ /* Can any range from range_box not extend to the left side of the query? */ See also comments on overhigher and overlower operators from documentation: &< Does not extend to the right of? &> Does not extend to the left of? > Another question: Does it make sense to add the "minimal bad example for > the '&<' case" as test case, too? After all, it should pass the test after > the patch. Yes, it will make sense, but the Kyotaro's test case doesn't work for me and I still don't know how to force SP-GiST to create inner leaves without inserting hundreds of rows. -- Nikita Glukhov Postgres Professional:http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
At Fri, 10 Mar 2017 12:15:52 +0300, Nikita Glukhov <n.gluhov@postgrespro.ru> wrote in <48f6934b-b994-4aa2-b6ad-aaa4f5a12877@postgrespro.ru> > On 10.03.2017 02:13, Tels wrote: > > > I can't comment on the code, but the grammar on the comments caught my > > eye: > >> +/* Can any range from range_box does not extend higher than this > >> argument? */ > >> +static bool > >> +overLower2D(RangeBox *range_box, Range *query) > >> +{ > >> + return FPle(range_box->left.low, query->high) && > >> + FPle(range_box->right.low, query->high); > >> +} > > The sentence sounds quite garbled in English. I'm not entirely sure > > what > > it should be, but given the comment below "/* Can any range from > > range_box > > to be higher than this argument? */" maybe something like: > > > > /* Does any range from range_box extend to the right side of the > > query? */ > > > > If used, an analog wording should be used for overHigher2D's comment > > like: > > > > /* Does any range from range_box extend to the left side of the query? > > */ > > I've changed comments as you proposed, but I've added missing "not" > and left "Can": > > /* Can any range from range_box not extend to the right side of the > query? */ > /* Can any range from range_box not extend to the left side of the > query? */ > > See also comments on overhigher and overlower operators from > documentation: > > &< Does not extend to the right of? > &> Does not extend to the left of? > > > Another question: Does it make sense to add the "minimal bad example > > for > > the '&<' case" as test case, too? After all, it should pass the test > > after > > the patch. > > Yes, it will make sense, but the Kyotaro's test case doesn't work for > me and > I still don't know how to force SP-GiST to create inner leaves without > inserting hundreds of rows. I admit that the case is quite unstable, or environment-dependent and the minimal bad case no longer replays even for me. On the other hand, inserting some hundreds of boxes makes it more stable than only one box. I'm not sure that it is enough but it seems to be the best we can. As I mentioned previously, box cannot be used as the key for outer join so comparing one-by-one is out of hand of regression test. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation: not tested As I can see, this bugfix was already discussed and reviewed. All mentioned issues are fixed in the latest version of the patch So I mark it "Ready for committer". Thank you for fixing it! The new status of this patch is: Ready for Committer
Thank you, pushed. I just make test table permanent. Anastasia Lubennikova wrote: > The following review has been posted through the commitfest application: > make installcheck-world: tested, passed > Implements feature: tested, passed > Spec compliant: tested, passed > Documentation: not tested > > As I can see, this bugfix was already discussed and reviewed. > All mentioned issues are fixed in the latest version of the patch > So I mark it "Ready for committer". > Thank you for fixing it! > > The new status of this patch is: Ready for Committer > -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/