On October 25, 2007 09:22:10 pm you wrote:
> Did the sed actually do anything? (Hint: the file size of
> geonames_fixed.txt would be larger than geonames.txt if it did.
> Or you could diff the two files to confirm that something sensible
> happened.)
>
> I suspect that your shell may be fouling things up here. You may need
> to prepare this command as a one-line shell script, using an editor that
> doesn't barf on bare carriage returns ...
>
> regards, tom lane
OK, here's a bit more info after testing the sed both on a bash command
line and in a /bin/sh script edited with vi.
One of the troublesome rows from the original file looks like this. This is
just a couple of fields, not the whole row.
cat -A short.txt
^IJishishan Bonanzu Dongxiangzu Salarzu Zizhixian^M$
Jishishan Bonanzu Dongxiangzu Salarzu Zizhixian^M$
Jishishan Bonanzu Dongxiangzu Salarzu Zizhixian^M$
^I2007-07-06$
Here's the shell script:
/bin/sed 's/^M/\\r/' < short.txt > short.out
And the result:
^IJishishan Bonanzu Dongxiangzu Salarzu Zizhixian\r$
Jishishan Bonanzu Dongxiangzu Salarzu Zizhixian\r$
Jishishan Bonanzu Dongxiangzu Salarzu Zizhixian\r$
^I2007-07-06$
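A minimal sketch of the same substitution that does not depend on the editor, assuming the ^M in the script above stands for a literal carriage return typed into vi. The file names cr_sample.txt and cr_sample.out and the sample data are made up for illustration. printf supplies the carriage return, so none has to be typed:

```shell
#!/bin/sh
# Made-up sample with the same shape as short.txt: two lines end in CR+LF,
# the last line ends in a plain LF.
printf '\tJishishan\r\nJishishan\r\n\t2007-07-06\n' > cr_sample.txt

# Hold a real carriage return in a variable instead of typing one.
CR=$(printf '\r')

# Replace a CR at end of line with the two characters backslash and r.
# Inside double quotes, \\\\ reaches sed as \\, which sed emits as one backslash.
sed "s/${CR}\$/\\\\r/" < cr_sample.txt > cr_sample.out
```

Running cat -A on cr_sample.out should show the same \r$ line endings as the result above, which confirms the sed itself is doing what was asked.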
Of course the load still treats the newline after each \r as the end of the
row, and fails with:
ERROR: missing data for column "full_name"
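One possible direction, sketched under the assumption that each CR+LF pair is data inside a field and that only a bare LF ends a row: escape the newline as well, so the physical lines are joined into one row. In COPY text format the sequences \r and \n stand for carriage return and newline. The file names and sample data here are made up, and awk is swapped in for sed because joining lines is simpler there:

```shell
#!/bin/sh
# Made-up row whose middle field contains two embedded CR+LF pairs.
printf '1\tline one\r\nline two\r\nline three\t2007-07-06\n' > joined_sample.txt

# Lines ending in CR: drop the CR, append the four characters \r\n and print
# without a newline, so the next physical line continues the same row.
# Lines without a trailing CR end a row and are printed unchanged.
awk '{
    if (sub(/\r$/, "")) printf "%s%s", $0, "\\r\\n"
    else print
}' < joined_sample.txt > joined_sample.out
```

With this sample, joined_sample.out holds a single line with three tab-separated fields. Whether the real geonames data follows the same pattern is not something this sketch can confirm.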
Does that help diagnose the problem?