Thread: "People near me" query
Imagine an Orkut-like site. Suppose we have 'person' table of 100k people. About 75% of these people fill in their location (City/State/Country) information. We also have a 'city' table containing list of cities with their state & country and each city's latitude/longitude. Assume all people's location is registered in the 'city' table. How does one design a database to be able to process "Show me people that live no farther than 250 miles from where I live" quickly? I can do "Show me people that live within (A-X to A+X) latitude and (B-X to B+X) longitude" though. (Where A and B is the latitude and longitude [of the person], and X is some numeric value. -- dave
David Garamond wrote: > Imagine an Orkut-like site. Suppose we have 'person' table of 100k > people. About 75% of these people fill in their location > (City/State/Country) information. We also have a 'city' table > containing list of cities with their state & country and each city's > latitude/longitude. Assume all people's location is registered in the > 'city' table. > > How does one design a database to be able to process "Show me people > that live no farther than 250 miles from where I live" quickly? I can > do "Show me people that live within (A-X to A+X) latitude and (B-X to > B+X) longitude" though. (Where A and B is the latitude and longitude > [of the person], and X is some numeric value. > Have you considered using PostGIS? http://postgis.refractions.net It will do all sorts of spatial queries for you including all of what you metioned and lots lots more. There are plenty of people who use it (including me) and it performs very well. I am not sure how easy it is to install, my colleague does that bit, but to use it is really quite simple. Nick
Le ven 19/03/2004 à 11:05, David Garamond a écrit : > Imagine an Orkut-like site. Suppose we have 'person' table of 100k > people. About 75% of these people fill in their location > (City/State/Country) information. We also have a 'city' table containing > list of cities with their state & country and each city's > latitude/longitude. Assume all people's location is registered in the > 'city' table. > > How does one design a database to be able to process "Show me people > that live no farther than 250 miles from where I live" quickly? I can do > "Show me people that live within (A-X to A+X) latitude and (B-X to B+X) > longitude" though. (Where A and B is the latitude and longitude [of the > person], and X is some numeric value. Hi David, I think the answer depends on the precision you want. Reading your post, it doesn't seem to me you need a lot of precision. Usually your technique could do for an approximation. If you want to be more precise though, you would have to use a completely different calculation or structure. Did you think about the fact that longitude is different in terms of distance if you look it from here or from the equator? Also, using a difference in terms of longitude and latitude just by making a subtraction will give you persons which are actually located at more than sqrt(2) times 250 miles. Making it 350 miles sometimes (and that's still flying like a bird). A more precise way of doing this would be to keep a table with distances between cities, and then calculate your way to your destination by taking the shorter path (and that's only a question of distance, not time)... That's really a lot more calculations. It depends on what you need. Yannick
Yannick Warnier wrote: >>Imagine an Orkut-like site. Suppose we have 'person' table of 100k >>people. About 75% of these people fill in their location >>(City/State/Country) information. We also have a 'city' table containing >>list of cities with their state & country and each city's >>latitude/longitude. Assume all people's location is registered in the >>'city' table. >> >>How does one design a database to be able to process "Show me people >>that live no farther than 250 miles from where I live" quickly? I can do >>"Show me people that live within (A-X to A+X) latitude and (B-X to B+X) >>longitude" though. (Where A and B is the latitude and longitude [of the >>person], and X is some numeric value. > > I think the answer depends on the precision you want. Reading your post, > it doesn't seem to me you need a lot of precision. > > Usually your technique could do for an approximation. If you want to be > more precise though, you would have to use a completely different > calculation or structure. > > Did you think about the fact that longitude is different in terms of > distance if you look it from here or from the equator? Yes, I'm aware about the [near-]spherical nature of the Earth. I *was* afraid that I would need to store the distances between cities since that would mean the distance table size would be (city table)**2. I'll check PostGIS out, thanks. -- dave
In article <405AC5D7.1090906@zara.6.isreserved.com>, David Garamond <lists@zara.6.isreserved.com> writes: > Imagine an Orkut-like site. Suppose we have 'person' table of 100k > people. About 75% of these people fill in their location > (City/State/Country) information. We also have a 'city' table > containing list of cities with their state & country and each city's > latitude/longitude. Assume all people's location is registered in the > 'city' table. > How does one design a database to be able to process "Show me people > that live no farther than 250 miles from where I live" quickly? I can > do "Show me people that live within (A-X to A+X) latitude and (B-X to > B+X) longitude" though. (Where A and B is the latitude and longitude > [of the person], and X is some numeric value. On a flat surface, this gives you a square where the corners are too far away, but it's a nice way to weed all the records definitely outside of the circle. For the remaining records, you could use sqrt (dx**2 + dy**2) in your application. Things get worse if you want to take the spherical nature of the surface into account, but are you sure you need that?
* Nick Barr (nicky@chuckie.co.uk) wrote: > http://postgis.refractions.net I second this recommendation. Additionally, consider checking out GDAL, ogr2ogr and the TIGER dataset provided by the US Census (if you're in the US anyway). It provides information about basically all the streets, landmarks, etc in the country. > It will do all sorts of spatial queries for you including all of what > you metioned and lots lots more. There are plenty of people who use it > (including me) and it performs very well. I am not sure how easy it is > to install, my colleague does that bit, but to use it is really quite > simple. I didn't have too much trouble building/installing it. In fact, I'm in the process of putting together a Debian postgresql-postgis package which will handle installing it for you on Debian systems. My main concern with that package atm is how to best handle database upgrades. I'm waiting to see the new multi-version installation stuff the PostgreSQL Debian maintainer is doing and see how that will change what I need to do with PostGIS. I'm certainly interested in having testers for the Debian package though. Anyone who's interested please contact me off-list. Stephen
Attachment
I've done this, it is easy. Well, in the UK anyway. We have something called the national grid (http://www.gps.gov.uk/natgrid/introduction.asp) But it should be fairly easy to convert long/lat to a simpler grid for your country. If you haven't read the intro to thr grid, it is basicaly a 0 point somewhere south east of england and coordinates are given in meters east and north. You can subscribe to databases that map postcodes to coordinates. Which is what I will do when the site goes live, but in the mean time I am stealing them from http://www.streetmap.co.uk/. (x and y in the map page's URL. Search for "SW15 1NY") Once you have that, the rest is easy. create a column of the type "point" and store the grid coordinates in there. The just use the "contains" operator (~) in a query. (http://www.postgresql.org/docs/7.4/static/functions-geometry.html) Example: My coordinates are 523857,175349. So to find anyone living within 10KM of me, I just do "select * from people where '((523857,175349),10000)' ~ location" Unfortunately, Postgres doesn't know how to index this. So make sure you have some other things narrowing it down using an index (m/f, age, etc.) to avoid a full table scan. Hope that helps, Bas.
"Bas Scheffers" <bas@scheffers.net> writes: > Once you have that, the rest is easy. create a column of the type "point" > and store the grid coordinates in there. The just use the "contains" > operator (~) in a query. > (http://www.postgresql.org/docs/7.4/static/functions-geometry.html) > Example: > My coordinates are 523857,175349. So to find anyone living within 10KM of > me, I just do "select * from people where '((523857,175349),10000)' ~ > location" > Unfortunately, Postgres doesn't know how to index this. You can index such queries using rtree indexes. There was a discussion of this with a full example just a couple weeks ago: http://archives.postgresql.org/pgsql-novice/2004-03/msg00070.php It's likely that PostGIS provides an even better solution, but I haven't used it. regards, tom lane
On Fri, Mar 19, 2004 at 17:05:11 +0700, David Garamond <lists@zara.6.isreserved.com> wrote: > Imagine an Orkut-like site. Suppose we have 'person' table of 100k > people. About 75% of these people fill in their location > (City/State/Country) information. We also have a 'city' table containing > list of cities with their state & country and each city's > latitude/longitude. Assume all people's location is registered in the > 'city' table. > > How does one design a database to be able to process "Show me people > that live no farther than 250 miles from where I live" quickly? I can do > "Show me people that live within (A-X to A+X) latitude and (B-X to B+X) > longitude" though. (Where A and B is the latitude and longitude [of the > person], and X is some numeric value. The earthdistance contrib package allows you to do these kinds of queries with gist indexes. This might be a simpler solution than using PostGIS which has a lot of other features you don't appear to be using.
The earthdistance package is great. In conjunction with one of the many zip code databases available on the net, here's a simple function PHP that does returns a bunch of zipcodes close to you, along with the mileage of each. // Get zipcodes for a radius function getzipcodes ($zipcode="", $radius=10) { $zip = lib_getsql("SELECT latitude,longitude FROM zipcodes WHERE zip='$zipcode'"); $istartlat = $zip[0][latitude]; $istartlong = $zip[0][longitude]; $iradius = $radius; $latrange = $iradius / ((6067.0/5280.0) * 60.0); $longrange = $iradius / (((cos($istartlat * pi() / 180) * 6076.0) / 5280.0) * 60); $lowlatitude = $istartlat - $latrange; $highlatitude = $istartlat + $latrange; $lowlongitude = $istartlong - $longrange; $highlongitude = $istartlong + $longrange; $sql = "SELECT zipcode, point($istartlat,$istartlong) <@> point(latitude,longitude) as miles FROM zipcodes WHERE (latitude BETWEEN $lowlatitude AND $highlatitude) AND (longitude BETWEEN $lowlongitude AND $highlongitude) ORDER BY miles LIMIT 500"; $results = lib_getsql($sql); return($results); } No doubt you can get an apropo data file for other countries. - Ericson Smith Bruno Wolff III wrote: >On Fri, Mar 19, 2004 at 17:05:11 +0700, > David Garamond <lists@zara.6.isreserved.com> wrote: > > >>Imagine an Orkut-like site. Suppose we have 'person' table of 100k >>people. About 75% of these people fill in their location >>(City/State/Country) information. We also have a 'city' table containing >>list of cities with their state & country and each city's >>latitude/longitude. Assume all people's location is registered in the >>'city' table. >> >>How does one design a database to be able to process "Show me people >>that live no farther than 250 miles from where I live" quickly? I can do >>"Show me people that live within (A-X to A+X) latitude and (B-X to B+X) >>longitude" though. (Where A and B is the latitude and longitude [of the >>person], and X is some numeric value. >> >> > >The earthdistance contrib package allows you to do these kinds of queries >with gist indexes. This might be a simpler solution than using PostGIS >which has a lot of other features you don't appear to be using. > >---------------------------(end of broadcast)--------------------------- >TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) > > >