Re: WIP: cross column correlation ... - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: WIP: cross column correlation ...
Date
Msg-id 4D6917AF020000250003B06D@gw.wicourts.gov
In response to WIP: cross column correlation ... (PostgreSQL - Hans-Jürgen Schönig <postgres@cybertec.at>)
List pgsql-hackers
Greg Stark wrote:
> Consider the oft-quoted example of a  -- or
>  for Americans.

I'm not sure everyone realizes just how complicated this particular
issue is.  If we can do a good job with U.S. city, state, zip code we
will have something which will handle a lot of cases.

Consider:

(1)  Municipality name isn't unique in the U.S.  Many states besides
Wisconsin have a municipality called Madison (I seem to remember
there were over 20 of them).  So city without state doesn't
necessarily get you anywhere near having a unique zip code or range.

(2)  A large city has a set of zip codes, all starting with the same
first three digits.  So identifying the municipality doesn't always
identify the zip code, although for small cities it often does.
Madison, Wisconsin has thirty-some zip codes, some of which are
rather specialized and don't see much use.

(3)  Small municipalities surrounded by or adjacent to a large city
may not get their own zip code.  53704 not only covers a large swath
of the northern end of the City of Madison, but is also the zip code
for the Village of Maple Bluff and at least parts of the Township of
Westport.
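
To make the three points above concrete, here is a minimal sketch in
Python that checks which column combinations behave like a functional
dependency on a tiny sample.  The rows are hand-written for
illustration only (they are not authoritative postal data), and the
helper names distinct_targets and is_functional are hypothetical, not
anything from PostgreSQL or the patch under discussion.

```python
from collections import defaultdict

# Hand-written sample rows (city, state, zip); illustration only,
# not authoritative postal data.
ROWS = [
    ("Madison", "WI", "53703"),
    ("Madison", "WI", "53704"),
    ("Maple Bluff", "WI", "53704"),
    ("Westport", "WI", "53704"),
    ("Madison", "AL", "35758"),
]

CITY, STATE, ZIP = 0, 1, 2


def distinct_targets(rows, key_cols, target_col):
    """Map each key (tuple of key column values) to the set of
    target values that appear together with it."""
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[i] for i in key_cols)
        seen[key].add(row[target_col])
    return dict(seen)


def is_functional(rows, key_cols, target_col):
    """True if every key maps to exactly one target value in rows."""
    groups = distinct_targets(rows, key_cols, target_col)
    return all(len(values) == 1 for values in groups.values())


# Point (1): city alone does not determine zip (Madison is in WI and AL).
city_to_zip = is_functional(ROWS, [CITY], ZIP)

# Point (2): even (city, state) does not determine zip for a large city.
city_state_to_zip = is_functional(ROWS, [CITY, STATE], ZIP)

# Point (3): zip does not determine city (53704 spans three municipalities).
zip_to_city = is_functional(ROWS, [ZIP], CITY)

# Within this sample only, zip does determine state.
zip_to_state = is_functional(ROWS, [ZIP], STATE)

print(city_to_zip, city_state_to_zip, zip_to_city, zip_to_state)
```

On this sample none of city -> zip, (city, state) -> zip, or
zip -> city holds strictly, which is why a statistic that only
records "column A determines column B" would likely be too coarse
for this case.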

I guess what I'm saying is that this use case has enough complexity
to make an interesting problem to solve.  It may even be more
challenging than you would want for an initial trial of a technique.

-Kevin

