Re: extracting location info from string - Mailing list pgsql-sql

From Andrej
Subject Re: extracting location info from string
Date
Msg-id BANLkTikGzsrvXprLXeN3TxZwA=M8fzkn=g@mail.gmail.com
Whole thread Raw
In response to Re: extracting location info from string  (Tarlika Elisabeth Schmitz <postgresql3@numerixtechnology.de>)
Responses Re: extracting location info from string  (Craig Ringer <craig@postnewspapers.com.au>)
Re: extracting location info from string  (Tarlika Elisabeth Schmitz <postgresql3@numerixtechnology.de>)
List pgsql-sql
On 23 May 2011 10:00, Tarlika Elisabeth Schmitz
<postgresql3@numerixtechnology.de> wrote:
> On Sun, 22 May 2011 21:05:26 +0100
> Tarlika Elisabeth Schmitz <postgresql3@numerixtechnology.de> wrote:
>
>>A column contains location information, which may contain any of the
>>following:
>>
>>1) null
>>2) country name (e.g. "France")
>>3) city name, region name (e.g. "Bonn, Nordrhein-Westfalen")
>>4) city name, Rg. region name (e.g. "Frankfurt, Rg. Hessen")
>>5) city name, Rg region name (e.g. "Frankfurt, Rg Hessen")
>
>
> I also need to cope with variations of COUNTRY.NAME and REGION.NAME.

I'm not sure I fully understand your request, but sanitising that data will be
tedious, whichever way you dice it - particularly with the variations
on region.

Another thing of great import is whether the city can occur in the
data column all by itself; if yes, it's next to impossible to distinguish
it from a country.  Specially if we assume that typos/misspelling are
feasible on top of punctuation ... of course, you could create a list
of valid cities and countries, with homophones thrown in for good
measure, and compare & replace things appropriately, w/ name
w/o a clean match being reported for human interaction =o)

If I had a task like that to perform I'd dump the data out to file
and have a good go at it w/ sed & awk, or perl, depending on
how complex & voluminous the data is.



Cheers,
Andrej


--
Please don't top post, and don't use HTML e-Mail :}  Make your quotes concise.

http://www.georgedillon.com/web/html_email_is_evil.shtml


pgsql-sql by date:

Previous
From: Tarlika Elisabeth Schmitz
Date:
Subject: Re: extracting location info from string
Next
From: Craig Ringer
Date:
Subject: Re: extracting location info from string