Thread: RTREE on points
Am I missing the point (no pun intended ;-) of RTREE indices? I was expecting a "point_ops" opclass or similar... [7.1 on RedHat 6.2] SELECT am.amname AS acc_name, opc.opcname AS ops_name, COUNT(*) FROM pg_am am, pg_amop amop, pg_opclassopc WHERE amop.amopid = am.oid AND amop.amopclaid = opc.oid AND am.amname = 'rtree' GROUP BY am.amname,opc.opcname ORDER BY acc_name, ops_name; acc_name | ops_name | count ----------+------------+-------rtree | bigbox_ops | 8rtree | box_ops | 8rtree | poly_ops | 8 (3 rows) Surely the most natural application of an RTREE is to index points, as well as boxes and polygons. E.g. CREATE TABLE "nodes" ( "node" point, "node_name" character varying(30) ); CREATE INSERT INTO nodes VALUES ('(1,1)', 'a'); INSERT 207372 1 INSERT INTO nodes VALUES ('(1,2)', 'b'); INSERT 207373 1 INSERT INTO nodes VALUES ('(3,2)', 'c'); INSERT 207374 1 INSERT INTO nodes VALUES ('(5,4)', 'd'); INSERT 207375 1 INSERT INTO nodes VALUES ('(7,8)', 'e'); INSERT 207376 1 INSERT INTO nodes VALUES ('(11,10)', 'f'); INSERT 207377 1 INSERT INTO nodes VALUES ('(101,11)', 'g'); INSERT 207378 1 explain select * from nodes where node @ '((1,1),(3,3))'::box; NOTICE: QUERY PLAN: Seq Scan on nodes (cost=0.00..22.50 rows=500 width=28) So create an RTREE index to help...but predictably: CREATE INDEX test_rtree ON nodes USING RTREE (node); ERROR: DefineIndex: type point has no default operator class I can do something like: CREATE INDEX test_rtree ON nodes USING RTREE (box(node,node)); CREATE but then: explain select * from nodes where node @ '((1,1),(3,3))'::box; NOTICE: QUERY PLAN: Seq Scan on nodes (cost=0.00..1.09 rows=4 width=28) and even: explain select * from nodes where box(node,node) @ '((1,1),(3,3))'::box; NOTICE: QUERY PLAN: Seq Scan on nodes (cost=0.00..1.10 rows=1 width=28) Thanks for any help Julian Scarfe
Julian Scarfe wrote: > > explain select * from nodes where box(node,node) @ '((1,1),(3,3))'::box; > NOTICE: QUERY PLAN: > Seq Scan on nodes (cost=0.00..1.10 rows=1 width=28) > this should work, assuming you have enough points to make a difference (in the optimizer's mind, at least). the optimizer still doesn't do a great job of knowing when it's best to use an index, although, in your sample, there's no way it would ever be cheaper to use an index. there's simply not enough data there. you can test to see if an index can be used by a query by shutting off the sequential scans (set enable_seqscan=off) and retrying the query. essentially, this forces it to use an index scan if at all possible. -- Jeff Hoffmann PropertyKey.com
Julian Scarfe wrote: > > It hadn't occured to me that the index would simply not be used and I'm > grateful for the pointer to the appropriate variable. i wouldn't recommend turning off sequential scans for day-to-day usage, but it certainly can be useful for debugging and testing. if you have specific queries that you need optimized, you can do that, but the whole point of cost estimates is to give a good estimate of what a normal query would return, so if you have a lot of ad-hoc queries, it's probably better to just trust the system. > Nevertheless, wouldn't... > > CREATE INDEX test_rtree ON nodes USING RTREE (node); > (which fails) > > ...be a lot simpler than... > > CREATE INDEX test_rtree ON nodes USING RTREE (box(node,node)); > (which succeeds, as above) > > ? yes, it does seem like a little more work, but there doesn't seem to be a lot of usage of the geometric functions by the developers to look at missing features -- they're mostly just reactive to problems. i really have never dug into tweaking access methods to get this to work, but i would imagine it's not that hard to implement. -- Jeff Hoffmann PropertyKey.com
Jeff Hoffmann <jeff@propertykey.com> writes: > yes, it does seem like a little more work, but there doesn't seem to be > a lot of usage of the geometric functions by the developers to look at > missing features -- they're mostly just reactive to problems. Jeff's correct, none of the core developers have much time to spend on adding features to the geometric datatypes. If someone else wants to step up to the plate, though, contributions are welcome ;-) The procedure for adding a new index opclass is somewhat documented for btree opclasses. To add an rtree opclass you'd be adding a different set of operators and support functions, which set isn't documented anywhere that I know of; you'd have to look at the existing examples to figure out what is needed. BTW, you should also look at the GIST stuff and figure out whether it might not be better to develop a GIST opclass instead of rtree. In the long run I suspect GIST will be better supported than rtree, since it's more general. regards, tom lane
Julian Scarfe wrote: > > > > explain select * from nodes where box(node,node) @ '((1,1),(3,3))'::box; > > NOTICE: QUERY PLAN: > > Seq Scan on nodes (cost=0.00..1.10 rows=1 width=28) From: "Jeff Hoffmann" <jeff@propertykey.com> > this should work, assuming you have enough points to make a difference > (in the optimizer's mind, at least). the optimizer still doesn't do a > great job of knowing when it's best to use an index, although, in your > sample, there's no way it would ever be cheaper to use an index. > there's simply not enough data there. you can test to see if an index > can be used by a query by shutting off the sequential scans (set > enable_seqscan=off) and retrying the query. essentially, this forces it > to use an index scan if at all possible. And indeed it does, thank you, Jeff: # set enable_seqscan=off; SET VARIABLE # explain select * from nodes where box(node,node) @ '((1,1),(3,3))'::box; NOTICE: QUERY PLAN: Index Scan using test_rtree on nodes (cost=0.00..2.02 rows=1 width=28) It hadn't occured to me that the index would simply not be used and I'm grateful for the pointer to the appropriate variable. Nevertheless, wouldn't... CREATE INDEX test_rtree ON nodes USING RTREE (node); (which fails) ...be a lot simpler than... CREATE INDEX test_rtree ON nodes USING RTREE (box(node,node)); (which succeeds, as above) ? The latter feels contorted and possibly inefficient. After all, I don't do...: CREATE TABLE "nodes" ( "node" point, "node_name" character varying(30) ); INSERT INTO nodes VALUES ('(1,1)', 'a'); INSERT INTO nodes VALUES ('(1,2)', 'b'); INSERT INTO nodes VALUES ('(3,2)', 'c'); INSERT INTO nodes VALUES ('(5,4)', 'd'); INSERT INTO nodes VALUES ('(7,8)', 'e'); INSERT INTO nodes VALUES ('(11,10)', 'f'); INSERT INTO nodes VALUES ('(101,11)', 'g'); CREATE INDEX test_btree ON nodes USING BTREE (textcat(node_name,node_name)); ...if I want to index by name? (even though in principle it would work) Thanks for any guidance. Julian Scarfe
Tom Lane wrote: > BTW, you should also look at the GIST stuff and figure out whether > it might not be better to develop a GIST opclass instead of rtree. > In the long run I suspect GIST will be better supported than rtree, > since it's more general. > > regards, tom lane are there any built-in GIST opclasses? i didn't see any when looking through the system tables. weren't there things like gist_box_ops & gist_poly_ops at one time? i know there are a couple of GiST examples in contrib (seg, cube & intarray), but i thought there used to be at least a gist_box_ops. or was that another contrib item that got dropped? -- Jeff Hoffmann PropertyKey.com
Tom Lane wrote: > > I don't recall any such thing having been removed, but it does seem > peculiar that there are no GIST opclasses in the standard distribution. > How the heck did the GIST index code get developed/tested without some > opclasses? doing some digging at berkeley, i found the original pggist patch file that created the gist access method & gist_box_ops opclass (among others). i'm assuming that patch was the basis for what was originally introduced, so i don't know why it didn't get included with everything else. it looks like there are a lot of calls to internal postgresql box comparison functions that would need to get converted to the new calling convention, but it should be pretty straightforward to get it to work with a recent version of postgresql. it does seem pretty silly to have it in there if you don't have any built-in way of using it, if for no other reason than to be able to test if the feature even works. -- Jeff Hoffmann PropertyKey.com
Jeff Hoffmann <jeff@propertykey.com> writes: > I know there are a couple of GiST examples in contrib (seg, cube & > intarray), but i thought there used to be at least a gist_box_ops. I don't recall any such thing having been removed, but it does seem peculiar that there are no GIST opclasses in the standard distribution. How the heck did the GIST index code get developed/tested without some opclasses? Anyone remember the history? AFAICT from the CVS logs, GIST was added to the tree in mid 1996, but no opclasses for it were added at the time. I'd go digging in the maillist archives, but www.postgresql.org is too friggin' slow at the moment ... regards, tom lane
Jeff Hoffmann <jeff@propertykey.com> writes: > Tom Lane wrote: >> How the heck did the GIST index code get developed/tested without some >> opclasses? > doing some digging at berkeley, i found the original pggist patch file > that created the gist access method & gist_box_ops opclass (among > others). i'm assuming that patch was the basis for what was originally > introduced, so i don't know why it didn't get included with everything > else. The CVS logs show there was some confusion about applying the whole of the pggist patch, so it's possible that this was an unintentional omission. > it looks like there are a lot of calls to internal postgresql box > comparison functions that would need to get converted to the new calling > convention, but it should be pretty straightforward to get it to work > with a recent version of postgresql. it does seem pretty silly to have > it in there if you don't have any built-in way of using it, if for no > other reason than to be able to test if the feature even works. Yes. I had been thinking of migrating one or more of the new contrib GIST opclasses into the main distribution for 7.2, if only to be able to create a regress test for GIST. Updating this old Berkeley code would be another path to create/extend a GIST regress test. If you (or someone else) have time to do that, it'd be great. regards, tom lane
GiST is great ! You may look at http://www.sai.msu.su/~megera/postgres/gist/ for GiST implementation of RTree - it could be not compiled with 7.1 release due to some api changes, but it's not difficult to do. If somebody want it I could contribute it to contrib area. Tom, what we need to do to improve GiST support is to provide selectivity information of gist indices to optimizer. I'll post separate message about this problem Oleg On Tue, 17 Apr 2001, Jeff Hoffmann wrote: > Tom Lane wrote: > > > > I don't recall any such thing having been removed, but it does seem > > peculiar that there are no GIST opclasses in the standard distribution. > > How the heck did the GIST index code get developed/tested without some > > opclasses? > > doing some digging at berkeley, i found the original pggist patch file > that created the gist access method & gist_box_ops opclass (among > others). i'm assuming that patch was the basis for what was originally > introduced, so i don't know why it didn't get included with everything > else. it looks like there are a lot of calls to internal postgresql box > comparison functions that would need to get converted to the new calling > convention, but it should be pretty straightforward to get it to work > with a recent version of postgresql. it does seem pretty silly to have > it in there if you don't have any built-in way of using it, if for no > other reason than to be able to test if the feature even works. > > Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
Oleg Bartunov wrote: > > GiST is great ! > > You may look at http://www.sai.msu.su/~megera/postgres/gist/ > for GiST implementation of RTree - it could be not compiled with 7.1 > release due to some api changes, but it's not difficult to do. it looks like i just wasted a good couple of hours trying to convert the gist_box_ops. it did help find the pointer problem i was having because i'm still not up to speed on the new function calling conventions, though... > If somebody want it I could contribute it to contrib area. i'm definitely interested. i'm going to play with it & if oleg's claim holds about index insertion time holds, i can definitely see myself moving to it over the built in rtree. anything that can cut down the hours of index creation time would be great. also, it seems that it'd be a good choice for inclusion in the standard distribution because it'd be easy to test -- you already have to run rtree tests anyway, you can just duplicate them with gist & gist_box_ops. -- Jeff Hoffmann PropertyKey.com
On Tue, 17 Apr 2001, Jeff Hoffmann wrote: > Oleg Bartunov wrote: > > > > GiST is great ! > > > > You may look at http://www.sai.msu.su/~megera/postgres/gist/ > > for GiST implementation of RTree - it could be not compiled with 7.1 > > release due to some api changes, but it's not difficult to do. > > it looks like i just wasted a good couple of hours trying to convert the > gist_box_ops. it did help find the pointer problem i was having because > i'm still not up to speed on the new function calling conventions, > though... > :-) > > If somebody want it I could contribute it to contrib area. > > i'm definitely interested. i'm going to play with it & if oleg's claim > holds about index insertion time holds, i can definitely see myself > moving to it over the built in rtree. anything that can cut down the > hours of index creation time would be great. also, it seems that it'd > be a good choice for inclusion in the standard distribution because it'd > be easy to test -- you already have to run rtree tests anyway, you can > just duplicate them with gist & gist_box_ops. > nice you noticed that ! We'll update contrib-rtree_box_gist for 7.1 release in a few days. > Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
Jeff, I checked the archive and it works with 7.1 release we've implemented only several functions for box, so you may use them as example for remaining types. Regards, Oleg On Wed, 18 Apr 2001, Oleg Bartunov wrote: > On Tue, 17 Apr 2001, Jeff Hoffmann wrote: > > > Oleg Bartunov wrote: > > > > > > GiST is great ! > > > > > > You may look at http://www.sai.msu.su/~megera/postgres/gist/ > > > for GiST implementation of RTree - it could be not compiled with 7.1 > > > release due to some api changes, but it's not difficult to do. > > > > it looks like i just wasted a good couple of hours trying to convert the > > gist_box_ops. it did help find the pointer problem i was having because > > i'm still not up to speed on the new function calling conventions, > > though... > > > > :-) > > > > If somebody want it I could contribute it to contrib area. > > > > i'm definitely interested. i'm going to play with it & if oleg's claim > > holds about index insertion time holds, i can definitely see myself > > moving to it over the built in rtree. anything that can cut down the > > hours of index creation time would be great. also, it seems that it'd > > be a good choice for inclusion in the standard distribution because it'd > > be easy to test -- you already have to run rtree tests anyway, you can > > just duplicate them with gist & gist_box_ops. > > > > nice you noticed that ! We'll update contrib-rtree_box_gist for 7.1 release > in a few days. > > > > > Regards, > Oleg > _____________________________________________________________ > Oleg Bartunov, sci.researcher, hostmaster of AstroNet, > Sternberg Astronomical Institute, Moscow University (Russia) > Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ > phone: +007(095)939-16-83, +007(095)939-23-83 > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://www.postgresql.org/search.mpl > Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
I believe Marc install the GIST code into the tree long ago. > Jeff Hoffmann <jeff@propertykey.com> writes: > > I know there are a couple of GiST examples in contrib (seg, cube & > > intarray), but i thought there used to be at least a gist_box_ops. > > I don't recall any such thing having been removed, but it does seem > peculiar that there are no GIST opclasses in the standard distribution. > How the heck did the GIST index code get developed/tested without some > opclasses? > > Anyone remember the history? AFAICT from the CVS logs, GIST was added > to the tree in mid 1996, but no opclasses for it were added at the time. > I'd go digging in the maillist archives, but www.postgresql.org is too > friggin' slow at the moment ... > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026