Re: request for help with COPY syntax - Mailing list pgsql-sql

From Chuck D.
Subject Re: request for help with COPY syntax
Date
Msg-id 200710250935.24041.pgsql-list@nullmx.com
Whole thread Raw
In response to Re: request for help with COPY syntax  (Paul Lambert <paul.lambert@autoledgers.com.au>)
Responses Re: request for help with COPY syntax
List pgsql-sql
On October 24, 2007 01:10:59 am Paul Lambert wrote:
>
> I get around this problem with my data loads by specifying some other
> arbitrary character that I know won't appear in the data as the quote
> character.
>
> Eg QUOTE E'\f' will specify form feed as the quote character, ergo any
> data with double or single quotes will be loaded with those quote
> characters in the string.
>
> Something similar may help with your case.

This was the solution.  I specified a quote character that was not in the data 
and the data imported perfectly.  Without specifying any delimiter postgres 
defaults to one of the quotes (I forget which).

Unfortunately, the data I imported wasn't good.  MaxMind, like the 
Geonames.org derivatives, uses FIPS code for a state identifier in the cities 
table for all countries EXCEPT USA in which case they use the iso code.  Both 
these data sets mix types within one column and I find that absolutely 
unacceptable.

Back to my original problem, which was trying to COPY in some of the 
earth-info.nga.mil world city data.  This data is tab delimited, no quotes 
around fields, newline line terminated and UTF-8 encoded.

Using a similar COPY statement with the defaults, it fails with this:

COPY geo.orig_city FROM '/home/www/geo/DATA/nga.mil/geonames_no_header.txt';

ERROR:  literal carriage return found in data
HINT:  Use "\r" to represent carriage return.
CONTEXT:  COPY orig_city, line 1071850

And of course, at that line we find a field that has several lines which 
appear (using cat -A) to be terminated with a new line ($).  I originally 
deleted this line but there are others like it.  And the file is 2 Gigs in 
size so it isn't acceptable to comb through it.

I believe this is a new problem because I have a vintage file dated early 2007 
that didn't have this problem.  Does anyone know how to solve this COPY 
issue?



pgsql-sql by date:

Previous
From: Erik Jones
Date:
Subject: Re: get only rows for latest version of contents
Next
From: "Chuck D."
Date:
Subject: Re: request for help with COPY syntax