Thread: Howto read a UTF-8 CSV with COPY?

Howto read a UTF-8 CSV with COPY?

From
Andreas
Date:
Hi,
I'd like to import some data from ms-office that has text columns with
international characters in it.

COPY complains about illegal byte sequences.

When I placed a "set client_encoding = LATIN1" in front of the COPY
command, COPY was happy.
It still didn't work as expected: the imported table had empty attributes
wherever the source file contained accented characters.

Is there a way to import UTF8 encoded csv files ?

Re: Howto read a UTF-8 CSV with COPY?

From
LazyTrek
Date:
Andreas,

Have you looked at using pgloader? 

http://pgloader.projects.postgresql.org

Also did you save the MS Office file as a simple plain text file?

I'm a novice myself, but I do know that pgloader offers slightly more advanced options than the plain COPY command.


On Thu, Nov 11, 2010 at 10:01 PM, Andreas <maps.on@gmx.net> wrote:
Hi,
I'd like to import some data from ms-office that has text columns with international characters in it.

COPY complains about illegal byte sequences.

When I placed a "set client_encoding = LATIN1" in front of the COPY command, COPY was happy.
It still didn't work as expected. There were empty attributes in the imported table when there were accent decorated characters in the source file.

Is there a way to import UTF8 encoded csv files ?


Re: Howto read a UTF-8 CSV with COPY?

From
Jasen Betts
Date:
On 2010-11-11, Andreas <maps.on@gmx.net> wrote:
> Hi,
> I'd like to import some data from ms-office that has text columns with
> international characters in it.
>
> COPY complains about illegal byte sequences.
>
> When I placed a "set client_encoding = LATIN1" in front of the COPY
> command, COPY was happy.
> It still didn't work as expected. There were empty attributes in the
> imported table when there were accent decorated characters in the source
> file.

latin-1 was probably the wrong client encoding unless it was office on
a really old Mac.

> Is there a way to import UTF8 encoded csv files ?

copy.
but the files must be utf8, close is not good enough.

what sequence is it complaining about?
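
One way to answer that question without going through COPY is to let a small script point at the first bytes that are not valid UTF-8. The following is only a sketch in Python; the function name first_invalid_utf8 is made up for illustration and is not part of PostgreSQL or any library.

```python
def first_invalid_utf8(data: bytes):
    """Return (offset, offending_bytes) for the first invalid UTF-8
    sequence in data, or None if everything decodes cleanly."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start and err.end delimit the bytes the decoder rejected
        return err.start, data[err.start:err.end]
    return None


def check_file(path: str) -> str:
    """Hypothetical helper: read a file as raw bytes and describe the result."""
    with open(path, "rb") as fh:
        result = first_invalid_utf8(fh.read())
    if result is None:
        return "file is valid UTF-8"
    offset, bad = result
    return f"invalid sequence {bad!r} at byte offset {offset}"
```

Calling check_file on the exported CSV (the path is whatever your file is called) gives a byte offset that can be looked up in a hex viewer to see which character the exporter wrote.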

--
⚂⚃ 100% natural

Re: Howto read a UTF-8 CSV with COPY?

From
Andreas
Date:
On 15.11.2010 19:10, Jasen Betts wrote:
>
> latin-1 was probably the wrong client encoding unless it was office on
> a really old Mac.
No, it's Windows XP.


>> Is there a way to import UTF8 encoded csv files ?
> copy.
> but the files must be utf8, close is not good enough.
That might be the problem.
After I let Notepad++ convert it to UTF-8, COPY read it without issues so far.
It appears Excel 2000 can't produce the right flavour. Even the
"Unicode text" export got rejected by COPY.
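
For anyone who wants to script the conversion instead of using an editor, here is a rough Python sketch. It rests on assumptions that are not confirmed in this thread: that a file starting with a UTF-16 byte order mark is UTF-16 (as far as I know, that is what Excel's "Unicode text" export writes, tab-delimited), and that anything else that fails to decode as UTF-8 is Windows-1252. The function name to_utf8 and the fallback parameter are made up for illustration.

```python
import codecs


def to_utf8(raw: bytes, fallback: str = "cp1252") -> bytes:
    """Re-encode exported bytes as UTF-8 without a byte order mark.

    Assumption: input is UTF-16 with BOM, UTF-8 (with or without BOM),
    or the single-byte fallback encoding (Windows-1252 by default).
    """
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        text = raw.decode("utf-16")      # the utf-16 codec consumes the BOM
    elif raw.startswith(codecs.BOM_UTF8):
        text = raw.decode("utf-8-sig")   # strips the UTF-8 BOM
    else:
        try:
            text = raw.decode("utf-8")   # already valid UTF-8: keep as is
        except UnicodeDecodeError:
            # may itself raise if a byte is undefined in the fallback
            text = raw.decode(fallback)
    return text.encode("utf-8")
```

Read the source file in binary mode, pass the bytes through to_utf8, and write the result to a new file for COPY. The guess is only a heuristic: some Windows-1252 byte runs happen to be valid UTF-8 too, so check a few accented values after the import.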


> what sequence is it complaining about?
In most cases it was a local character, though I think even latin1 has
all of them besides the € sign.
And it didn't like - (minus or probably ndash).
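
The € sign and the en dash are exactly the kind of characters that separate Latin-1 from the Windows code page: Windows-1252 (which PostgreSQL calls WIN1252) stores them as the bytes 0x80 and 0x96, while in Latin-1 those byte values are control codes. A small Python sketch to check this; the helper name representable is made up for illustration.

```python
EURO, EN_DASH = "\u20ac", "\u2013"


def representable(ch: str, encoding: str) -> bool:
    """True if the character can be encoded in the given encoding."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def report() -> None:
    """Print which of the three encodings can hold each sample character."""
    for ch in (EURO, EN_DASH, "\u00fc"):
        row = {enc: representable(ch, enc)
               for enc in ("latin-1", "cp1252", "utf-8")}
        print(repr(ch), row)
```

Calling report() lists, per character, whether each encoding can represent it. If the Excel export turns out to be Windows-1252, declaring that as the client encoding rather than LATIN1 would be the thing to try; that is a guess about this particular file, not something established in the thread.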


Thanks for your reply.