Thread: Best practices for geo-spatial city name searches?
Hello, I use PostgreSQL and the "cube" type to perform geo-spatial zipcode proximity searches. I'm wondering about the best practices also supporting a geo-spatial distance search based on a city name rather than zipcode. In our original data model, we used a 'zipcodes' table, with the zipcode as the primary key. This can of course contain a "City Name" column, but there is a problem with this, illustrated a "Nome, Alaska" case. Nome's zipcode is 99762. It maps to multiple cities including Diomede, Alaska and Nome, Alaska. In the data model described, only the "Diomede" row is imported, and the other rows, including the "Nome, Alaska" row are dropped. So if you try to search for Nome, Alaska, you won't find anything. One solution would be to have a "cities" table, with the city/state as the primary key, and a zipcode as an additional column. Then, by joining on the zipcodes table, the coordinates for a city could be found. Is there any other way I should be considering data modelling to support searches on zipcodes and cities? Thanks! Mark -- . . . . . . . . . . . . . . . . . . . . . . . . . . . Mark Stosberg Principal Developer mark@summersault.com Summersault, LLC 765-939-9301 ext 202 database driven websites. . . . . http://www.summersault.com/. . . . . . . .
On Tuesday 24 February 2009 08:19:56 am Mark Stosberg wrote: > Hello, > > I use PostgreSQL and the "cube" type to perform geo-spatial zipcode > proximity searches. I'm wondering about the best practices also supporting > a geo-spatial distance search based on a city name rather than zipcode. > > In our original data model, we used a 'zipcodes' table, with the zipcode as > the primary key. This can of course contain a "City Name" column, but > there is a problem with this, illustrated a "Nome, Alaska" case. Nome's > zipcode is 99762. It maps to multiple cities including Diomede, Alaska and > Nome, Alaska. > > In the data model described, only the "Diomede" row is imported, and the > other rows, including the "Nome, Alaska" row are dropped. So if you try to > search for Nome, Alaska, you won't find anything. > > One solution would be to have a "cities" table, with the city/state as the > primary key, and a zipcode as an additional column. Then, by joining on the > zipcodes table, the coordinates for a city could be found. > > Is there any other way I should be considering data modelling to support > searches on zipcodes and cities? > > Thanks! > > Mark > > > -- > . . . . . . . . . . . . . . . . . . . . . . . . . . . > Mark Stosberg Principal Developer > mark@summersault.com Summersault, LLC > 765-939-9301 ext 202 database driven websites > . . . . . http://www.summersault.com/ . . . . . . . . I don't know if this is any help. I recently used google to obtain the longitude and latitude and then used simple math to determine the distance between the locations to determine proximity searches. Like finding the closes store. -- John Fabiani
> -----Mensaje original----- > From: Mark Stosberg > > Hello, > > I use PostgreSQL and the "cube" type to perform geo-spatial > zipcode proximity searches. I'm wondering about the best > practices also supporting a geo-spatial distance search based > on a city name rather than zipcode. > > In our original data model, we used a 'zipcodes' table, with > the zipcode as the primary key. This can of course contain a > "City Name" column, but there is a problem with this, > illustrated a "Nome, Alaska" case. Nome's zipcode is 99762. > It maps to multiple cities including Diomede, Alaska and Nome, Alaska. > > In the data model described, only the "Diomede" row is > imported, and the other rows, including the "Nome, Alaska" > row are dropped. So if you try to search for Nome, Alaska, > you won't find anything. > > One solution would be to have a "cities" table, with the > city/state as the primary key, and a zipcode as an additional > column. Then, by joining on the zipcodes table, the > coordinates for a city could be found. > > Is there any other way I should be considering data modelling > to support searches on zipcodes and cities? > You absolutely need zipcode as a primary key? If you must enforce non duplicate entries use country + state + county + city_name instead. You might still need to throw zipcode into the PK for certain cities (worldwide). Otherwise, latitud & longitude provide a better natural key, or simply use a non data related sequential bigint. Regards.
At 08:20 AM 2/25/2009, pgsql-sql-owner@postgresql.org wrote: >To: pgsql-sql@postgresql.org >From: Mark Stosberg <mark@summersault.com> >Subject: Best practices for geo-spatial city name searches? >Date: Tue, 24 Feb 2009 11:19:56 -0500 >Message-ID: <20090224111956.5b7a4301@summersault.com> >X-Archive-Number: 200902/94 >X-Sequence-Number: 32231 > >Hello, > >I use PostgreSQL and the "cube" type to perform geo-spatial zipcode >proximity >searches. I'm wondering about the best practices also supporting a >geo-spatial >distance search based on a city name rather than zipcode. > >In our original data model, we used a 'zipcodes' table, with the >zipcode as the >primary key. This can of course contain a "City Name" column, but >there is a >problem with this, illustrated a "Nome, Alaska" case. Nome's zipcode >is 99762. >It maps to multiple cities including Diomede, Alaska and Nome, Alaska. > >In the data model described, only the "Diomede" row is imported, and >the other >rows, including the "Nome, Alaska" row are dropped. So if you try to >search >for Nome, Alaska, you won't find anything. > >One solution would be to have a "cities" table, with the city/state as >the >primary key, and a zipcode as an additional column. Then, by joining >on the >zipcodes table, the coordinates for a city could be found. > >Is there any other way I should be considering data modelling to >support >searches on zipcodes and cities? Hi Mark, I built a very similar system for www.hutz.com. It uses a complete postcode database, without dropping nearby/overlapping cities. It also includes the postcode "alias" values, which are names that the post office uses as "equivalent" to the official names. Within the city table, I created a series of self-joining id's: id|alias_city_id|post_code_city_id|muni_city_id So a city record can be an "alias", "postcode" or "muni" record. A muni record is the definitive record for a city (and is defined by the postcode record closest to the city center as defined by the USGS). A postcode record, represents a postcode (zipcode) region within a city. An alias represents an alternate name that either refers to a muni record or a postcode record (and is defined as "alias_city_id IS NOT NULL") So if I want to search the table for only muni city records, the query looks like select * from city where id = muni_city_id I also included lat/long coordinates for every record, making it easy to calculate distances and find all city records within a certain range, etc. (I used the "point" and "circle" operators for this along with a GiST index - it's not perfect for long distances - it assumes the earth is flat, but it works great for small distances and is very fast). I hope this helps. Feel free to contact me on-list or off, if you want to discuss more. Steve