Re: extracting location info from string - Mailing list pgsql-sql

From Rob Sargent
Subject Re: extracting location info from string
Date
Msg-id 4DDD7DA2.7040309@gmail.com
Whole thread Raw
In response to Re: extracting location info from string  (Tarlika Elisabeth Schmitz <postgresql3@numerixtechnology.de>)
List pgsql-sql

On 05/25/2011 03:13 PM, Tarlika Elisabeth Schmitz wrote:
> On Wed, 25 May 2011 09:25:48 -0600
> Rob Sargent<robjsargent@gmail.com>  wrote:
>
>>
>>
>> On 05/24/2011 10:57 AM, Lew wrote:
>>> Tarlika Elisabeth Schmitz wrote:
>>>
>>>> CREATE TABLE person
>>>> (
>>>> id integer NOT NULL,
>>>> "name" character varying(256) NOT NULL,
>>>> "location" character varying(256),
>>>> CONSTRAINT person_pkey PRIMARY KEY (id)
>>>> );
>>>>
>>>> this was just a TEMPORARY table I created for quick analysis
>>>> of my CSV data (now renamed to temp_person).
>
> CREATE TABLE country
> (
>    id character varying(3) NOT NULL, -- alpha-3 code
>    "name" character varying(50) NOT NULL,
>    CONSTRAINT country_pkey PRIMARY KEY (id)
> );
>
>
>> To minimize the ultimately quite necessary human adjudication, one
>> might make good use of what is often termed "crowd sourcing":  Keep
>> all the distinct "hand entered" values and a map to the final human
>> assessment.
>
> I was wondering how to do just that. I don't think it would be a good
> idea to hard code this into the clean-up script. Take, for instance,
> variations of COUNTRY.NAME spelling. Where would I store these?
>
> I could do with a concept for this problem, which applies to a lot of
> string-type info.
>
I think you keep your current structures used for deducing the canonical 
forms, but with each unique input encounter you add it to you 
seen-thus-far list which becomes just one more check (possibly the first 
such check).

create table address_input
(   id unique/sequence,   human_input character varying(256),   resolution character varying(256)
)

You may have to add a column for the type of input (if you know for 
instance the input is for street address v. country) or you may want the 
resolution to be portioned in to county, city and so on.



pgsql-sql by date:

Previous
From: Tarlika Elisabeth Schmitz
Date:
Subject: Re: extracting location info from string
Next
From: Charlie
Date:
Subject: Re: [SQL] extracting location info from string