Re: Best practices for geo-spatial city name searches? - Mailing list pgsql-sql

From Steve Midgley
Subject Re: Best practices for geo-spatial city name searches?
Date
Msg-id 20090225184316.09F9E633204@mail.postgresql.org
Whole thread Raw
In response to Best practices for geo-spatial city name searches?  (Mark Stosberg <mark@summersault.com>)
List pgsql-sql
At 08:20 AM 2/25/2009, pgsql-sql-owner@postgresql.org wrote:
>To: pgsql-sql@postgresql.org
>From:  Mark Stosberg <mark@summersault.com>
>Subject: Best practices for geo-spatial city name searches?
>Date:  Tue, 24 Feb 2009 11:19:56 -0500
>Message-ID:  <20090224111956.5b7a4301@summersault.com>
>X-Archive-Number: 200902/94
>X-Sequence-Number: 32231
>
>Hello,
>
>I use PostgreSQL and the "cube" type to perform geo-spatial zipcode 
>proximity
>searches. I'm wondering about the best practices also supporting a 
>geo-spatial
>distance search based on a city name rather than zipcode.
>
>In our original data model, we used a 'zipcodes' table, with the 
>zipcode as the
>primary key.  This can of course contain a "City Name" column, but 
>there is a
>problem with this, illustrated a "Nome, Alaska" case. Nome's zipcode 
>is 99762.
>It maps to multiple cities including Diomede, Alaska and Nome, Alaska.
>
>In the data model described, only the "Diomede" row is imported, and 
>the other
>rows, including the "Nome, Alaska" row are dropped. So if you try to 
>search
>for Nome, Alaska, you won't find anything.
>
>One solution would be to have a "cities" table, with the city/state as 
>the
>primary key, and a zipcode as an additional column. Then, by joining 
>on the
>zipcodes table, the coordinates for a city could be found.
>
>Is there any other way I should be considering data modelling to 
>support
>searches on zipcodes and cities?

Hi Mark,

I built a very similar system for www.hutz.com. It uses a complete 
postcode database, without dropping nearby/overlapping cities. It also 
includes the postcode "alias" values, which are names that the post 
office uses as "equivalent" to the official names.

Within the city table, I created a series of self-joining id's:

id|alias_city_id|post_code_city_id|muni_city_id

So a city record can be an "alias", "postcode" or "muni" record. A muni 
record is the definitive record for a city (and is defined by the 
postcode record closest to the city center as defined by the USGS). A 
postcode record, represents a postcode (zipcode) region within a city. 
An alias represents an alternate name that either refers to a muni 
record or a postcode record (and is defined as "alias_city_id IS NOT 
NULL")

So if I want to search the table for only muni city records, the query 
looks like

select * from city where id = muni_city_id

I also included lat/long coordinates for every record, making it easy 
to calculate distances and find all city records within a certain 
range, etc. (I used the "point" and "circle" operators for this along 
with a GiST index - it's not perfect for long distances - it assumes 
the earth is flat, but it works great for small distances and is very 
fast).

I hope this helps. Feel free to contact me on-list or off, if you want 
to discuss more.

Steve



pgsql-sql by date:

Previous
From: "A. Kretschmer"
Date:
Subject: Re: Add column by using SELECT statement
Next
From: Ivan Sergio Borgonovo
Date:
Subject: row not deleted but updated (trigger?)