Re: GiST index question: performance - Mailing list pgsql-sql

From Steve Midgley
Subject Re: GiST index question: performance
Date
Msg-id 20070305213005.E6BEC9FB265@postgresql.org
Whole thread Raw
In response to Re: GiST index question: performance  (Oleg Bartunov <oleg@sai.msu.su>)
Responses Re: GiST index question: performance  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-sql
Thanks Oleg - very interesting stuff you are working on. <br /><br /> You may recall I exchanged emails with you on
openftsa little while ago - my ISP that manages my Pg SQL server is (in my interests) concerned about installing
anythingnon-standard (read: unstable) onto their server. I was able to get them to install your TSearch2 b/c it's been
provenmany times, but I'm hesitant to even bring up Q3C since it's less widely deployed.<br /><br /> The search method
Iproposed in my first email is not totally accurate but just searching circles with radii using a GiST index and
standardPg circle datatypes seems like a "close enough" solution for me (as opposed to Q3C's conical search
intersectionswith a spherical projection). I realize that at higher latitudes my circles will be elliptical but our
needsare for approximations that are <i>very</i> fast rather than accurate and the radii being searched are small
relativeto the size of the sphere (I.e. when searching Nome, find everything in +/- 40 miles and especially don't
returnAnchorage POI).. <br /><br /> It's an end user database, so if the query takes 500ms, that's really too long. On
theQ3C site, I see that your measure of speed is processing many, many rows in 20 hours, which is a whole different
ballgame.:)<br /><br /> Do you have a thought as to whether GiST is going to be faster/more efficient with Pg standard
typesof polygons or circles? I suppose I should just test out both, and quit wasting your time. I'll certainly repost
tothe list with whatever I uncover.<br /><br /> I really do appreciate the help you've provided.<br /><br />
Sincerely,<br/><br /> Steve<br /><br /><br /><br /> At 12:21 PM 3/5/2007, you wrote:<br /><blockquote cite=""
class="cite"type="cite">On Mon, 5 Mar 2007, Steve Midgley wrote:<br /><br /><blockquote cite="" class="cite"
type="cite">Hi,<br/><br /> First off, can I say how much I love GiST? It's already solved a few problems for me that
seemedimpossible to solve in real-time queries. Thanks to everyone who works on that project!</blockquote><br />
Thanks,Steve !<br /><br /><blockquote cite="" class="cite" type="cite"><br /> I'm developing a geographic index based
ona set of zip code boundaries. Points of interest (POI) will fall within some boundaries and not others. I need to
searchto find which POI are within a specified boundary.</blockquote><br /> You POI is what we call ConeSearch query in
astronomy.<br/> Please, take a look on Q3C algorithm available from <a eudora="autourl"
href="http://q3c.sf.net/">http://q3c.sf.net</a>.<br/> Some information <a eudora="autourl"
href="http://www.sai.msu.su/~megera/wiki/SkyPixelization">http://www.sai.msu.su/~megera/wiki/SkyPixelization</a><br
/><br/> This is what we use in our Virtual Observatory project and we're able to<br /> work with 10^9 objects on
moderatehardware. It doesn't use GiST but<br /> special pixelization scheme allow to use standard Btree.<br /><br
/><blockquotecite="" class="cite" type="cite"><br /> I think have two options (see below) and I'm wondering if anyone
hasan opinion or experience as to whether one or the other will have substantially different performance
characteristics.I can obviously test when I get that far, but I'd prefer to try the anticipated faster route first, if
anyonehas existing experience they can share:<br /><br /> 1) Index a series of circles of NN radius around each
boundarymarker (lat/long point). Run a search on POI for those that fall within any of the specified circles.<br /><br
/>2) Index a set of polygons that mark the "minimum area" around the boundary markers in question. Run a search on POI
thatfall within this single polygon.<br /><br /> The polygon will have more points, but there will be more circles to
search- my understanding of GiST is limited so I'm not sure if there's a performance benefit to searching many circles
ora few polygons.<br /><br /> My tables are of this size:<br /><br /> # of POI: 50,000<br /> # of zip blocks (with and
withoutregions): 217,000<br /> # of zip blocks in a given city (and hence in a given polygon): ~5<br /><br /> Any
thoughtsor ideas?<br /><br /> Thank you,<br /><br /> Steve<br /><br /> p.s. I could use a GIS system alongside of
Postgresbut performance and efficiency are key to this system, and it seems to me that raw GiST indexed SQL queries are
goingto be fastest and create the lowest load on the server?<br /><br /><br /> ---------------------------(end of
broadcast)---------------------------<br/> TIP 7: You can help support the PostgreSQL project by donating at<br /><br
/>              <a eudora="autourl" href="http://www.postgresql.org/about/donate">
http://www.postgresql.org/about/donate</a><br/></blockquote><br />          Regards,<br />                  Oleg<br />
_____________________________________________________________<br/> Oleg Bartunov, Research Scientist, Head of AstroNet
(<aeudora="autourl" href="http://www.astronet.ru/">www.astronet.ru</a> ),<br /> Sternberg Astronomical Institute,
MoscowUniversity, Russia<br /> Internet: oleg@sai.msu.su, <a eudora="autourl" href="http://www.sai.msu.su/~megera/">
http://www.sai.msu.su/~megera/</a><br/> phone: +007(495)939-16-83, +007(495)939-23-83<br /><br /><br /></blockquote> 

pgsql-sql by date:

Previous
From: Oleg Bartunov
Date:
Subject: Re: GiST index question: performance
Next
From: Peter Eisentraut
Date:
Subject: Re: GiST index question: performance