Re: Best practices for geo-spatial city name searches? - Mailing list pgsql-sql
From | Steve Midgley |
---|---|
Subject | Re: Best practices for geo-spatial city name searches? |
Date | |
Msg-id | 20090225184316.09F9E633204@mail.postgresql.org Whole thread Raw |
In response to | Best practices for geo-spatial city name searches? (Mark Stosberg <mark@summersault.com>) |
List | pgsql-sql |
At 08:20 AM 2/25/2009, pgsql-sql-owner@postgresql.org wrote: >To: pgsql-sql@postgresql.org >From: Mark Stosberg <mark@summersault.com> >Subject: Best practices for geo-spatial city name searches? >Date: Tue, 24 Feb 2009 11:19:56 -0500 >Message-ID: <20090224111956.5b7a4301@summersault.com> >X-Archive-Number: 200902/94 >X-Sequence-Number: 32231 > >Hello, > >I use PostgreSQL and the "cube" type to perform geo-spatial zipcode >proximity >searches. I'm wondering about the best practices also supporting a >geo-spatial >distance search based on a city name rather than zipcode. > >In our original data model, we used a 'zipcodes' table, with the >zipcode as the >primary key. This can of course contain a "City Name" column, but >there is a >problem with this, illustrated a "Nome, Alaska" case. Nome's zipcode >is 99762. >It maps to multiple cities including Diomede, Alaska and Nome, Alaska. > >In the data model described, only the "Diomede" row is imported, and >the other >rows, including the "Nome, Alaska" row are dropped. So if you try to >search >for Nome, Alaska, you won't find anything. > >One solution would be to have a "cities" table, with the city/state as >the >primary key, and a zipcode as an additional column. Then, by joining >on the >zipcodes table, the coordinates for a city could be found. > >Is there any other way I should be considering data modelling to >support >searches on zipcodes and cities? Hi Mark, I built a very similar system for www.hutz.com. It uses a complete postcode database, without dropping nearby/overlapping cities. It also includes the postcode "alias" values, which are names that the post office uses as "equivalent" to the official names. Within the city table, I created a series of self-joining id's: id|alias_city_id|post_code_city_id|muni_city_id So a city record can be an "alias", "postcode" or "muni" record. A muni record is the definitive record for a city (and is defined by the postcode record closest to the city center as defined by the USGS). A postcode record, represents a postcode (zipcode) region within a city. An alias represents an alternate name that either refers to a muni record or a postcode record (and is defined as "alias_city_id IS NOT NULL") So if I want to search the table for only muni city records, the query looks like select * from city where id = muni_city_id I also included lat/long coordinates for every record, making it easy to calculate distances and find all city records within a certain range, etc. (I used the "point" and "circle" operators for this along with a GiST index - it's not perfect for long distances - it assumes the earth is flat, but it works great for small distances and is very fast). I hope this helps. Feel free to contact me on-list or off, if you want to discuss more. Steve