Re: GSoC 2015: SP-GIST for geometrical objects - Mailing list pgsql-hackers

From Arthur Silva
Subject Re: GSoC 2015: SP-GIST for geometrical objects
Date
Msg-id CAO_YK0UgM54LZ-Uq7ucRudGLM2t9APHHUO-mKYWSv15OZaf=uA@mail.gmail.com
Whole thread Raw
In response to Re[2]: [HACKERS] GSoC 2015: SP-GIST for geometrical objects  (Dima Ivanovskiy <dima-iv@mail.ru>)
List pgsql-hackers
<p dir="ltr"><br /> On Mar 27, 2015 6:41 PM, "Dima Ivanovskiy" <<a
href="mailto:dima-iv@mail.ru">dima-iv@mail.ru</a>>wrote:<br /> ><br /> ><br /> >> On Mar 27, 2015 11:08
AM,"Dima Ivanovskiy" <<a href="mailto:dima-iv@mail.ru">dima-iv@mail.ru</a>> wrote:<br /> >> ><br />
>>> Hello, I am Dmitrii, student of Moscow Institute of Physics and Technology<br /> >> ><br />
>>> Abstract:<br /> >> ><br /> >> > I chose project "Indexing prolonged geometrical objects
(i.e.boxes, circles, polygons, not points) with SP-GiST by mapping to 4d-space". <br /> >> > According to the
presentation<br/> >> > <a
href="https://www.pgcon.org/2011/schedule/attachments/197_pgcon-2011.pdf">https://www.pgcon.org/2011/schedule/attachments/197_pgcon-2011.pdf</a><br
/>>> > SP-GIST 3 times faster than GiST in some cases. But GIST supports geometrical data types: <br />
>>> box, circle, polygon with operators: && &> &< &<| >> << <<|
<@@> @ |&> |>> ~ ~=<br /> >> > Popular spatial extension PostGIS doesn't include SP-GIST,
buthas a lot of geometrical features.<br /> >> ><br /> >> > Project details:<br /> >> ><br
/>>> > After meeting with Alexander Korotkov, I wrote some plan. <br /> >> > Using of K-D-tree and
Quadtreein building index for geometrical data types can increase speed of search in some cases.<br /> >> >
Themain idea is representing 2-D geometrical objects in their bounding box. Set of 2-D boxes is 4-D space. <br />
>>> New _ops will work with points from 4-D space, for example kd_box_ops, quad_circle_ops and will support
allgeometrical operators. <br /> >> > After conversion object to their bounding box algo has set of tuples
(x1,y1, x2, y2). <br /> >> > Our goal is separate this space the most equally. If we talk about K-D-tree, on
firststep K-D-tree algorithm will split space in 2 parts by the first coordinate, in next step by the second coordinate
etc.,after 4-th coordinate we repeat this procedure. <br /> >> > At the end we have index at geometrical
objectsand use traversal tree for every search operator. <br /> >> ><br /> >> > Postgresql has
alreadyhas realization ideas of MBR in gist/gistproc.c. So I will transfer this realization to other type of tree.<br
/>>> ><br /> >> > Of cource, I assume that SP-GIST can be not the best decision of this problem. So
aftertesting this clear methods, I will try to find more effective way. Maybe with using combination of different
spatialtree structures.<br /> >> ><br /> >> > Project Schedule:<br /> >> ><br /> >>
>until May 25<br /> >> ><br /> >> > Read documentation and source code, clarify details of
implementation.<br/> >> ><br /> >> > 1st month<br /> >> ><br /> >> > Implement new
'_ops'with all geometrical operators for box, circle, polygon<br /> >> ><br /> >> > 2nd month<br />
>>><br /> >> > Research new methods for increase speed of geometrical query<br /> >> ><br />
>>> 3rd month<br /> >> ><br /> >> > Final refactoring, testing and submitting a patch.<br />
>>><br /> >> ><br /> >> > Links:<br /> >> ><br /> >> > <a
href="http://www.sai.msu.su/~megera/postgres/talks/gist_tutorial.html">http://www.sai.msu.su/~megera/postgres/talks/gist_tutorial.html</a>
-about GIST<br /> >> > <a
href="https://toster.ru/q/27135#answer_110197">https://toster.ru/q/27135#answer_110197</a>- people need SP-GIST for
cubes<br/> >> > <a
href="http://www.slideshare.net/profyclub_ru/o-lt">http://www.slideshare.net/profyclub_ru/o-lt</a>- presentation about
indexes<br/> >> > <a
href="http://pgconf.ru/static/presentations/2015/korotkov_spatial.pdf">http://pgconf.ru/static/presentations/2015/korotkov_spatial.pdf</a>
-working with geo objects<br /> >> ><br /> >> Nice proposal.<br /> >><br /> >> Dynamic
Kdtreescan perform badly as the splitting median can get way off as updates are coming. What are your thoughts about
that?<br/> >><br /> >> Also what's up with the 4d space? I don't quite get it. These types are 2 or 3
dimensions.<br/> ><br /> ><br /> > I read spgist README one more time. I didn't find the mechanism for
maintaininggood balance after updates.<br /> > I think we can use Bkd-Tree, <a
href="https://www.cs.duke.edu/~pankaj/publications/papers/bkd-sstd.pdf">https://www.cs.duke.edu/~pankaj/publications/papers/bkd-sstd.pdf</a>.
ButIt can be not the best solving.<br /> > I include Research time in 2nd month of timeline.<br /> ><br /> >
About4d space. All these types are 2 dimensional.<br /> > Just as in R-tree object is approximated by MBR. MBR for
2d-objectscan be mapped to 4d-point. More general, nd-object MBR can be mapped into 2nd-point. <br /> ><br /> ><p
dir="ltr">Thereason I said that is because you pointed performance as one motivation  factor (and the lack of balancing
propertiescan degrade performance really fast on larger indexes).<p dir="ltr">The Bkd variant seems interesting but I
don'tthink spgist provides enough abstraction to implement it.<p dir="ltr">A bounding box can still be inserted/queried
witha 2d kdtree so I don't know why you call it 4d. I assume it's a matter of naming.<p dir="ltr">Overall the proposal
seemstotally doable, although I have doubts about its usefulness. But I'm no one... You should have some feedback from
thementors soon.<br /> 

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Rounding to even for numeric data type
Next
From: Andrew Gierth
Date:
Subject: Re: Rounding to even for numeric data type