Thread: Best practices for geo-spatial city name searches?

Best practices for geo-spatial city name searches?

From
Mark Stosberg
Date:
Hello, 

I use PostgreSQL and the "cube" type to perform geo-spatial zipcode proximity
searches. I'm wondering about the best practices also supporting a geo-spatial
distance search based on a city name rather than zipcode.

In our original data model, we used a 'zipcodes' table, with the zipcode as the
primary key.  This can of course contain a "City Name" column, but there is a 
problem with this, illustrated a "Nome, Alaska" case. Nome's zipcode is 99762.
It maps to multiple cities including Diomede, Alaska and Nome, Alaska.

In the data model described, only the "Diomede" row is imported, and the other
rows, including the "Nome, Alaska" row are dropped. So if you try to search
for Nome, Alaska, you won't find anything.

One solution would be to have a "cities" table, with the city/state as the
primary key, and a zipcode as an additional column. Then, by joining on the
zipcodes table, the coordinates for a city could be found.

Is there any other way I should be considering data modelling to support
searches on zipcodes and cities? 

Thanks!
   Mark


-- . . . . . . . . . . . . . . . . . . . . . . . . . . .   Mark Stosberg            Principal Developer
mark@summersault.com    Summersault, LLC       765-939-9301 ext 202     database driven websites. . . . .
http://www.summersault.com/. . . . . . . .
 




Re: Best practices for geo-spatial city name searches?

From
johnf
Date:
On Tuesday 24 February 2009 08:19:56 am Mark Stosberg wrote:
> Hello,
>
> I use PostgreSQL and the "cube" type to perform geo-spatial zipcode
> proximity searches. I'm wondering about the best practices also supporting
> a geo-spatial distance search based on a city name rather than zipcode.
>
> In our original data model, we used a 'zipcodes' table, with the zipcode as
> the primary key.  This can of course contain a "City Name" column, but
> there is a problem with this, illustrated a "Nome, Alaska" case. Nome's
> zipcode is 99762. It maps to multiple cities including Diomede, Alaska and
> Nome, Alaska.
>
> In the data model described, only the "Diomede" row is imported, and the
> other rows, including the "Nome, Alaska" row are dropped. So if you try to
> search for Nome, Alaska, you won't find anything.
>
> One solution would be to have a "cities" table, with the city/state as the
> primary key, and a zipcode as an additional column. Then, by joining on the
> zipcodes table, the coordinates for a city could be found.
>
> Is there any other way I should be considering data modelling to support
> searches on zipcodes and cities?
>
> Thanks!
>
>     Mark
>
>
> --
>  . . . . . . . . . . . . . . . . . . . . . . . . . . .
>    Mark Stosberg            Principal Developer
>    mark@summersault.com     Summersault, LLC
>    765-939-9301 ext 202     database driven websites
>  . . . . . http://www.summersault.com/ . . . . . . . .

I don't know if this is any help.  I recently used google to obtain the 
longitude and latitude and then used simple math to determine the distance 
between the locations to determine proximity searches.  Like finding the 
closes store.

-- 
John Fabiani


Re: Best practices for geo-spatial city name searches?

From
"Fernando Hevia"
Date:
 

> -----Mensaje original-----
> From: Mark Stosberg
> 
> Hello, 
> 
> I use PostgreSQL and the "cube" type to perform geo-spatial 
> zipcode proximity searches. I'm wondering about the best 
> practices also supporting a geo-spatial distance search based 
> on a city name rather than zipcode.
> 
> In our original data model, we used a 'zipcodes' table, with 
> the zipcode as the primary key.  This can of course contain a 
> "City Name" column, but there is a problem with this, 
> illustrated a "Nome, Alaska" case. Nome's zipcode is 99762.
> It maps to multiple cities including Diomede, Alaska and Nome, Alaska.
> 
> In the data model described, only the "Diomede" row is 
> imported, and the other rows, including the "Nome, Alaska" 
> row are dropped. So if you try to search for Nome, Alaska, 
> you won't find anything.
> 
> One solution would be to have a "cities" table, with the 
> city/state as the primary key, and a zipcode as an additional 
> column. Then, by joining on the zipcodes table, the 
> coordinates for a city could be found.
> 
> Is there any other way I should be considering data modelling 
> to support searches on zipcodes and cities? 
> 

You absolutely need zipcode as a primary key? If you must enforce non
duplicate entries use country + state + county + city_name instead. You
might still need to throw zipcode into the PK for certain cities
(worldwide). 
Otherwise, latitud & longitude provide a better natural key, or simply use a
non data related sequential bigint.

Regards.



Re: Best practices for geo-spatial city name searches?

From
Steve Midgley
Date:
At 08:20 AM 2/25/2009, pgsql-sql-owner@postgresql.org wrote:
>To: pgsql-sql@postgresql.org
>From:  Mark Stosberg <mark@summersault.com>
>Subject: Best practices for geo-spatial city name searches?
>Date:  Tue, 24 Feb 2009 11:19:56 -0500
>Message-ID:  <20090224111956.5b7a4301@summersault.com>
>X-Archive-Number: 200902/94
>X-Sequence-Number: 32231
>
>Hello,
>
>I use PostgreSQL and the "cube" type to perform geo-spatial zipcode 
>proximity
>searches. I'm wondering about the best practices also supporting a 
>geo-spatial
>distance search based on a city name rather than zipcode.
>
>In our original data model, we used a 'zipcodes' table, with the 
>zipcode as the
>primary key.  This can of course contain a "City Name" column, but 
>there is a
>problem with this, illustrated a "Nome, Alaska" case. Nome's zipcode 
>is 99762.
>It maps to multiple cities including Diomede, Alaska and Nome, Alaska.
>
>In the data model described, only the "Diomede" row is imported, and 
>the other
>rows, including the "Nome, Alaska" row are dropped. So if you try to 
>search
>for Nome, Alaska, you won't find anything.
>
>One solution would be to have a "cities" table, with the city/state as 
>the
>primary key, and a zipcode as an additional column. Then, by joining 
>on the
>zipcodes table, the coordinates for a city could be found.
>
>Is there any other way I should be considering data modelling to 
>support
>searches on zipcodes and cities?

Hi Mark,

I built a very similar system for www.hutz.com. It uses a complete 
postcode database, without dropping nearby/overlapping cities. It also 
includes the postcode "alias" values, which are names that the post 
office uses as "equivalent" to the official names.

Within the city table, I created a series of self-joining id's:

id|alias_city_id|post_code_city_id|muni_city_id

So a city record can be an "alias", "postcode" or "muni" record. A muni 
record is the definitive record for a city (and is defined by the 
postcode record closest to the city center as defined by the USGS). A 
postcode record, represents a postcode (zipcode) region within a city. 
An alias represents an alternate name that either refers to a muni 
record or a postcode record (and is defined as "alias_city_id IS NOT 
NULL")

So if I want to search the table for only muni city records, the query 
looks like

select * from city where id = muni_city_id

I also included lat/long coordinates for every record, making it easy 
to calculate distances and find all city records within a certain 
range, etc. (I used the "point" and "circle" operators for this along 
with a GiST index - it's not perfect for long distances - it assumes 
the earth is flat, but it works great for small distances and is very 
fast).

I hope this helps. Feel free to contact me on-list or off, if you want 
to discuss more.

Steve