On October 24, 2007 01:10:59 am Paul Lambert wrote:
>
> I get around this problem with my data loads by specifying some other
> arbitrary character that I know won't appear in the data as the quote
> character.
>
> Eg QUOTE E'\f' will specify form feed as the quote character, ergo any
> data with double or single quotes will be loaded with those quote
> characters in the string.
>
> Something similar may help with your case.
This was the solution. I specified a quote character that was not in the data
and the data imported perfectly. Without specifying any delimiter postgres
defaults to one of the quotes (I forget which).
Unfortunately, the data I imported wasn't good. MaxMind, like the
Geonames.org derivatives, uses FIPS code for a state identifier in the cities
table for all countries EXCEPT USA in which case they use the iso code. Both
these data sets mix types within one column and I find that absolutely
unacceptable.
Back to my original problem, which was trying to COPY in some of the
earth-info.nga.mil world city data. This data is tab delimited, no quotes
around fields, newline line terminated and UTF-8 encoded.
Using a similar COPY statement with the defaults, it fails with this:
COPY geo.orig_city FROM '/home/www/geo/DATA/nga.mil/geonames_no_header.txt';
ERROR: literal carriage return found in data
HINT: Use "\r" to represent carriage return.
CONTEXT: COPY orig_city, line 1071850
And of course, at that line we find a field that has several lines which
appear (using cat -A) to be terminated with a new line ($). I originally
deleted this line but there are others like it. And the file is 2 Gigs in
size so it isn't acceptable to comb through it.
I believe this is a new problem because I have a vintage file dated early 2007
that didn't have this problem. Does anyone know how to solve this COPY
issue?