Re: GSoC 2015: SP-GIST for geometrical objects - Mailing list pgsql-hackers
From | Arthur Silva |
---|---|
Subject | Re: GSoC 2015: SP-GIST for geometrical objects |
Date | |
Msg-id | CAO_YK0UgM54LZ-Uq7ucRudGLM2t9APHHUO-mKYWSv15OZaf=uA@mail.gmail.com Whole thread Raw |
In response to | Re[2]: [HACKERS] GSoC 2015: SP-GIST for geometrical objects (Dima Ivanovskiy <dima-iv@mail.ru>) |
List | pgsql-hackers |
<p dir="ltr"><br /> On Mar 27, 2015 6:41 PM, "Dima Ivanovskiy" <<a href="mailto:dima-iv@mail.ru">dima-iv@mail.ru</a>>wrote:<br /> ><br /> ><br /> >> On Mar 27, 2015 11:08 AM,"Dima Ivanovskiy" <<a href="mailto:dima-iv@mail.ru">dima-iv@mail.ru</a>> wrote:<br /> >> ><br /> >>> Hello, I am Dmitrii, student of Moscow Institute of Physics and Technology<br /> >> ><br /> >>> Abstract:<br /> >> ><br /> >> > I chose project "Indexing prolonged geometrical objects (i.e.boxes, circles, polygons, not points) with SP-GiST by mapping to 4d-space". <br /> >> > According to the presentation<br/> >> > <a href="https://www.pgcon.org/2011/schedule/attachments/197_pgcon-2011.pdf">https://www.pgcon.org/2011/schedule/attachments/197_pgcon-2011.pdf</a><br />>> > SP-GIST 3 times faster than GiST in some cases. But GIST supports geometrical data types: <br /> >>> box, circle, polygon with operators: && &> &< &<| >> << <<| <@@> @ |&> |>> ~ ~=<br /> >> > Popular spatial extension PostGIS doesn't include SP-GIST, buthas a lot of geometrical features.<br /> >> ><br /> >> > Project details:<br /> >> ><br />>> > After meeting with Alexander Korotkov, I wrote some plan. <br /> >> > Using of K-D-tree and Quadtreein building index for geometrical data types can increase speed of search in some cases.<br /> >> > Themain idea is representing 2-D geometrical objects in their bounding box. Set of 2-D boxes is 4-D space. <br /> >>> New _ops will work with points from 4-D space, for example kd_box_ops, quad_circle_ops and will support allgeometrical operators. <br /> >> > After conversion object to their bounding box algo has set of tuples (x1,y1, x2, y2). <br /> >> > Our goal is separate this space the most equally. If we talk about K-D-tree, on firststep K-D-tree algorithm will split space in 2 parts by the first coordinate, in next step by the second coordinate etc.,after 4-th coordinate we repeat this procedure. <br /> >> > At the end we have index at geometrical objectsand use traversal tree for every search operator. <br /> >> ><br /> >> > Postgresql has alreadyhas realization ideas of MBR in gist/gistproc.c. So I will transfer this realization to other type of tree.<br />>> ><br /> >> > Of cource, I assume that SP-GIST can be not the best decision of this problem. So aftertesting this clear methods, I will try to find more effective way. Maybe with using combination of different spatialtree structures.<br /> >> ><br /> >> > Project Schedule:<br /> >> ><br /> >> >until May 25<br /> >> ><br /> >> > Read documentation and source code, clarify details of implementation.<br/> >> ><br /> >> > 1st month<br /> >> ><br /> >> > Implement new '_ops'with all geometrical operators for box, circle, polygon<br /> >> ><br /> >> > 2nd month<br /> >>><br /> >> > Research new methods for increase speed of geometrical query<br /> >> ><br /> >>> 3rd month<br /> >> ><br /> >> > Final refactoring, testing and submitting a patch.<br /> >>><br /> >> ><br /> >> > Links:<br /> >> ><br /> >> > <a href="http://www.sai.msu.su/~megera/postgres/talks/gist_tutorial.html">http://www.sai.msu.su/~megera/postgres/talks/gist_tutorial.html</a> -about GIST<br /> >> > <a href="https://toster.ru/q/27135#answer_110197">https://toster.ru/q/27135#answer_110197</a>- people need SP-GIST for cubes<br/> >> > <a href="http://www.slideshare.net/profyclub_ru/o-lt">http://www.slideshare.net/profyclub_ru/o-lt</a>- presentation about indexes<br/> >> > <a href="http://pgconf.ru/static/presentations/2015/korotkov_spatial.pdf">http://pgconf.ru/static/presentations/2015/korotkov_spatial.pdf</a> -working with geo objects<br /> >> ><br /> >> Nice proposal.<br /> >><br /> >> Dynamic Kdtreescan perform badly as the splitting median can get way off as updates are coming. What are your thoughts about that?<br/> >><br /> >> Also what's up with the 4d space? I don't quite get it. These types are 2 or 3 dimensions.<br/> ><br /> ><br /> > I read spgist README one more time. I didn't find the mechanism for maintaininggood balance after updates.<br /> > I think we can use Bkd-Tree, <a href="https://www.cs.duke.edu/~pankaj/publications/papers/bkd-sstd.pdf">https://www.cs.duke.edu/~pankaj/publications/papers/bkd-sstd.pdf</a>. ButIt can be not the best solving.<br /> > I include Research time in 2nd month of timeline.<br /> ><br /> > About4d space. All these types are 2 dimensional.<br /> > Just as in R-tree object is approximated by MBR. MBR for 2d-objectscan be mapped to 4d-point. More general, nd-object MBR can be mapped into 2nd-point. <br /> ><br /> ><p dir="ltr">Thereason I said that is because you pointed performance as one motivation factor (and the lack of balancing propertiescan degrade performance really fast on larger indexes).<p dir="ltr">The Bkd variant seems interesting but I don'tthink spgist provides enough abstraction to implement it.<p dir="ltr">A bounding box can still be inserted/queried witha 2d kdtree so I don't know why you call it 4d. I assume it's a matter of naming.<p dir="ltr">Overall the proposal seemstotally doable, although I have doubts about its usefulness. But I'm no one... You should have some feedback from thementors soon.<br />
pgsql-hackers by date: