Re: Postal code radius searches - Mailing list pgsql-general

From wsheldah@lexmark.com
Subject Re: Postal code radius searches
Date
Msg-id 200202061919.OAA12374@interlock2.lexmark.com
Whole thread Raw
In response to Postal code radius searches  (Milo Hyson <milo@cyberlifelabs.com>)
List pgsql-general

I'm sure that some more erudite answers will be forthcoming, but there's one
obvious improvement to the brute-force approach.

Don't calculate exact differences when an estimate will do. If zip codes are as
consistent as I think they are, from low numbers on the east coast to high
numbers on the west coast, you should be able to look at the first two or three
digits of both zip codes and tell whether they're anywhere near each other. I
would expect zip codes '11111' and '99999' to be at least more than 1000 miles
apart, and probably more than that. This is just for a rough estimate though.
How useful it is will partly depend on how big a radius you're typically
searching for.

I also wonder if you couldn't convert degrees longitude difference between two
points to an approximate number of miles, add on some miles for safety, and
restrict your search a bit that way. Do the same for degrees lattitude
difference. That way you restrict the records you have to search before you ever
start using that stored procedure to calculate exact differences.  If you're
looking for 500 miles from Houston, don' t worry about states on the east or
west coast, or next to canada, and probably a couple of states further in on
most sides. Once you get within that rough estimate, go ahead and calculate
exact differences. Though you should be able to assume neighboring zip codes
will be within range as well, without calculating them. If I'm looking for a 500
mile radius, zip codes that are only different in the last digit, or are within
a certain number of degrees lattitude and longitude, will match for sure, and
their exact distance doesn't need to be calculated. So you really only need to
calculate distances for a donut shaped region around the base postal code.

You might look at postgresql's geometric functions, if you haven't already. I
don't know whether an R-tree index would be useful, or even whether it's still
supported? Haven't actually worked with either of these, haven't needed to. Good
luck!

Wes Sheldahl



Milo Hyson <milo%cyberlifelabs.com@interlock.lexmark.com> on 02/06/2002 01:36:18
PM

To:   pgsql-general%postgresql.org@interlock.lexmark.com
cc:    (bcc: Wesley Sheldahl/Lex/Lexmark)
Subject:  [GENERAL] Postal code radius searches


I've been struggling with this problem for a while now and I can't seem to
find a solution. I have a postal-code database, currently populated with over
76,000 United States ZIP codes. Each record contains, among other things, the
latitude and longitude for the postal code. I have a stored procedure that
calculates the distance between any two points on the globe. I'm trying to
figure out a fast way to locate all of the postal codes within an arbitrary
radius of another postal code.

The brute force method requires a sequential scan of all 76,000 records
looking for those that fall within the specified area. A more
high-performance method would be to pre-calculate the distances between all
postal codes (possibly limiting the distance to save space). However, this
requires more than 76,000 ^ 2 database operations. On a 1 GHz box, I
calculated this would take nearly one year complete. It would take twice as
long if I wanted to create a second cache for city/state searches.

Does anybody have and tips on solving this issue? Is there any sort of
complex index I could create based on the results of an arbitrary stored
procedure call? Maybe some custom C code?

--
Milo Hyson
CyberLife Labs, LLC

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)





pgsql-general by date:

Previous
From: "Roderick A. Anderson"
Date:
Subject: Re: Postal code radius searches
Next
From: postgresql@fruru.com
Date:
Subject: Re: Postal code radius searches