Thread: Add missing operator <->(box, point)
Hi, hackers. Attached patches add missing distance operator <->(box, point). We already have reverse operator <->(point, box), but it can't be used for kNN search in GiST and SP-GiST. GiST and SP-GiST now support kNN searches over more complex polygons and circles, but do not support more simple boxes, which seems to be inconsistent. Description of the patches: 1. Add function dist_pb(box, point) and operator <->. 2. Add <-> to GiST box_ops. Extracted gist_box_distance_helper() common for gist_box_distance() and gist_bbox_distance(). 3. Add <-> to SP-GiST. Changed only catalog and tests. Box case is already checked in spg_box_quad_leaf_consistent(): out->recheckDistances = distfnoid == F_DIST_POLYP; -- Nikita Glukhov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
Hello Nikita, > Attached patches add missing distance operator <->(box, point). Indeed. > We already have reverse operator <->(point, box), but it can't be used > for kNN search in GiST and SP-GiST. GiST and SP-GiST now support kNN > searches over more complex polygons and circles, but do not support more > simple boxes, which seems to be inconsistent. > > Description of the patches: > 1. Add function dist_pb(box, point) and operator <->. About this first patch: applies cleanly, compiles, "make check" ok. No doc changes, but this was expected to work in the first place, according to the documention. About the test, I'd suggest to name the result columns, eg "pt to box dist" and "box to pt dist", otherwise why all is repeated is unclear. I notice that other distance tests do not test for commutativity. Are they also not implemented, or just not tested? If not implemented, I'd suggest to add them in the same batch. If not tested, maybe the patch should do as others, or maybe given the trivial implementation there should just be one test per commutted operator for coverage. ISTM that the committer would need to "bump the catalog revision number" because it adds new functions & operators. -- Fabien.
[ warning, drive-by comment ahead ] Fabien COELHO <coelho@cri.ensmp.fr> writes: > I notice that other distance tests do not test for commutativity. Are they > also not implemented, or just not tested? If not implemented, I'd suggest > to add them in the same batch. Yeah ... just looking at operators named <->, I see regression=# select oid, oid::regoperator, oprcom, oprcode from pg_operator where oprname = '<->'; oid | oid | oprcom | oprcode ------+----------------------+--------+--------------------------- 517 | <->(point,point) | 517 | point_distance 613 | <->(point,line) | 0 | dist_pl 614 | <->(point,lseg) | 0 | dist_ps 615 | <->(point,box) | 0 | dist_pb 616 | <->(lseg,line) | 0 | dist_sl 617 | <->(lseg,box) | 0 | dist_sb 618 | <->(point,path) | 0 | dist_ppath 706 | <->(box,box) | 706 | box_distance 707 | <->(path,path) | 707 | path_distance 708 | <->(line,line) | 708 | line_distance 709 | <->(lseg,lseg) | 709 | lseg_distance 712 | <->(polygon,polygon) | 712 | poly_distance 1520 | <->(circle,circle) | 1520 | circle_distance 1522 | <->(point,circle) | 3291 | dist_pc 3291 | <->(circle,point) | 1522 | dist_cpoint 3276 | <->(point,polygon) | 3289 | dist_ppoly 3289 | <->(polygon,point) | 3276 | dist_polyp 1523 | <->(circle,polygon) | 0 | dist_cpoly 1524 | <->(line,box) | 0 | dist_lb 5005 | <->(tsquery,tsquery) | 0 | pg_catalog.tsquery_phrase (20 rows) It's not clear to me why to be particularly more excited about <->(box, point) than about the other missing cases here. regards, tom lane
Attached 2nd version of the patches.
On 20.04.2019 16:41, Fabien COELHO wrote:
About the test, I'd suggest to name the result columns, eg "pt to box dist" and "box to pt dist", otherwise why all is repeated is unclear.
Fixed.
On 02.07.2019 7:01, Tom Lane wrote:
[ warning, drive-by comment ahead ] Fabien COELHO <coelho@cri.ensmp.fr> writes:I notice that other distance tests do not test for commutativity. Are they also not implemented, or just not tested? If not implemented, I'd suggest to add them in the same batch.Yeah ... just looking at operators named <->, I see regression=# select oid, oid::regoperator, oprcom, oprcode from pg_operator where oprname = '<->';oid | oid | oprcom | oprcode ------+----------------------+--------+--------------------------- 517 | <->(point,point) | 517 | point_distance 613 | <->(point,line) | 0 | dist_pl 614 | <->(point,lseg) | 0 | dist_ps 615 | <->(point,box) | 0 | dist_pb 616 | <->(lseg,line) | 0 | dist_sl 617 | <->(lseg,box) | 0 | dist_sb 618 | <->(point,path) | 0 | dist_ppath 706 | <->(box,box) | 706 | box_distance 707 | <->(path,path) | 707 | path_distance 708 | <->(line,line) | 708 | line_distance 709 | <->(lseg,lseg) | 709 | lseg_distance 712 | <->(polygon,polygon) | 712 | poly_distance1520 | <->(circle,circle) | 1520 | circle_distance1522 | <->(point,circle) | 3291 | dist_pc3291 | <->(circle,point) | 1522 | dist_cpoint3276 | <->(point,polygon) | 3289 | dist_ppoly3289 | <->(polygon,point) | 3276 | dist_polyp1523 | <->(circle,polygon) | 0 | dist_cpoly1524 | <->(line,box) | 0 | dist_lb5005 | <->(tsquery,tsquery) | 0 | pg_catalog.tsquery_phrase (20 rows) It's not clear to me why to be particularly more excited about <->(box, point) than about the other missing cases here. regards, tom lane
The original goal was to add support of ordering by distance to point to all geometric opclasses. As you can see, GiST and SP-GiST box_ops has no distance operator while more complex circle_ops and poly_ops have it:
SELECT amname, opcname, amopopr::regoperator AS dist_opr FROM pg_opclass LEFT JOIN pg_amop ON amopfamily = opcfamily AND amoppurpose = 'o', pg_am, pg_type WHERE opcmethod = pg_am.oid AND opcintype = pg_type.oid AND typcategory = 'G' ORDER BY 1, 2; amname | opcname | dist_opr --------+-------------------+--------------------brin | box_inclusion_ops | gist | box_ops | gist | circle_ops | <->(circle,point)gist | point_ops | <->(point,point)gist | poly_ops | <->(polygon,point)spgist | box_ops | spgist | kd_point_ops | <->(point,point)spgist | poly_ops | <->(polygon,point)spgist | quad_point_ops | <->(point,point) (9 rows)
We could use commuted "const <-> var" operators for kNN searches, but the current implementation requires the existence of "var <-> const" operators, and order-by-op clauses are rebuilt using them (see match_clause_to_ordering_op() at /src/backend/optimizer/path/indxpath.c). -- Nikita Glukhov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
On Tue, Jul 2, 2019 at 9:19 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote: > We could use commuted "const <-> var" operators for kNN searches, but the > current implementation requires the existence of "var <-> const" operators, and > order-by-op clauses are rebuilt using them (see match_clause_to_ordering_op() > at /src/backend/optimizer/path/indxpath.c). But probably it's still worth to just add commutator for every <-> operator and close this question. Otherwise, it may arise again once we want to add some more kNN support to opclasses or something. On the other hand, are we already going to limit oid consumption? ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Alexander Korotkov <a.korotkov@postgrespro.ru> writes: > On Tue, Jul 2, 2019 at 9:19 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote: >> We could use commuted "const <-> var" operators for kNN searches, but the >> current implementation requires the existence of "var <-> const" operators, and >> order-by-op clauses are rebuilt using them (see match_clause_to_ordering_op() >> at /src/backend/optimizer/path/indxpath.c). > But probably it's still worth to just add commutator for every <-> > operator and close this question. Yeah, I was just thinking that it was weird not to have the commutator operators, independently of indexing considerations. Seems like a usability thing. regards, tom lane
On Mon, Mar 11, 2019 at 2:49 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote: > 2. Add <-> to GiST box_ops. > Extracted gist_box_distance_helper() common for gist_box_distance() and > gist_bbox_distance(). For me it doesn't look worth having two distinct functions gist_box_distance_helper() and gist_bbox_distance(). What about having just one and leave responsibility for recheck flag to the caller? > 3. Add <-> to SP-GiST. > Changed only catalog and tests. Box case is already checked in > spg_box_quad_leaf_consistent(): > out->recheckDistances = distfnoid == F_DIST_POLYP; So, it seems to be fix of oversight in 2a6368343ff4. But assuming fixing this requires catalog changes, we shouldn't backpatch this. ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attached 3rd version of the patches.
On 02.07.2019 21:55, Alexander Korotkov wrote:
On Tue, Jul 2, 2019 at 9:19 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:We could use commuted "const <-> var" operators for kNN searches, but the current implementation requires the existence of "var <-> const" operators, and order-by-op clauses are rebuilt using them (see match_clause_to_ordering_op() at /src/backend/optimizer/path/indxpath.c).But probably it's still worth to just add commutator for every <-> operator and close this question. Otherwise, it may arise again once we want to add some more kNN support to opclasses or something. On the other hand, are we already going to limit oid consumption?
All missing distance operators were added to the first patch.
On 08.07.2019 18:22, Alexander Korotkov wrote:
On Mon, Mar 11, 2019 at 2:49 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:2. Add <-> to GiST box_ops. Extracted gist_box_distance_helper() common for gist_box_distance() and gist_bbox_distance().For me it doesn't look worth having two distinct functions gist_box_distance_helper() and gist_bbox_distance(). What about having just one and leave responsibility for recheck flag to the caller?
gist_bbox_distance() was removed. But maybe it would be better to replace two identical functions gist_circle_distance() and gist_poly_distance() with the single gist_bbox_distance()?
3. Add <-> to SP-GiST. Changed only catalog and tests. Box case is already checked in spg_box_quad_leaf_consistent(): out->recheckDistances = distfnoid == F_DIST_POLYP;So, it seems to be fix of oversight in 2a6368343ff4. But assuming fixing this requires catalog changes, we shouldn't backpatch this.
-- Nikita Glukhov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
On Mon, Jul 8, 2019 at 11:39 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote: > On 08.07.2019 18:22, Alexander Korotkov wrote: > For me it doesn't look worth having two distinct functions > gist_box_distance_helper() and gist_bbox_distance(). What about > having just one and leave responsibility for recheck flag to the > caller? > > gist_bbox_distance() was removed. OK. > But maybe it would be better to replace two identical functions > gist_circle_distance() and gist_poly_distance() with the single > gist_bbox_distance()? Sounds reasonable to me. ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On Tue, Jul 9, 2019 at 12:03 AM Alexander Korotkov <a.korotkov@postgrespro.ru> wrote: > On Mon, Jul 8, 2019 at 11:39 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote: > > On 08.07.2019 18:22, Alexander Korotkov wrote: > > For me it doesn't look worth having two distinct functions > > gist_box_distance_helper() and gist_bbox_distance(). What about > > having just one and leave responsibility for recheck flag to the > > caller? > > > > gist_bbox_distance() was removed. > > OK. > > > But maybe it would be better to replace two identical functions > > gist_circle_distance() and gist_poly_distance() with the single > > gist_bbox_distance()? > > Sounds reasonable to me. However, gist_poly_distance() and gist_circle_distance() have different signatures. Having same internal function to be corresponding to more than one catalog function cause troubles in sanity checks. So, let's leave it as it is. Revised patchset is attached. It contains commit messages as well as minor editorialization. I'm going to push this if no objections. ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company