Re: [GENERAL] unicode error and problem - Mailing list pgsql-hackers

From Markus Bertheau
Subject Re: [GENERAL] unicode error and problem
Msg-id 1080161348.1988.6.camel@yarrow.bertheau.de
List pgsql-hackers
On Wed, 24.03.2004, at 11:33, Paolo Supino wrote:
> Hi
>
>   I received a Unicode CSV file from someone (the file was created on a
> Windows system) and I'm trying to import it into PostgreSQL. When it gets to
> a line that isn't ASCII it prints the following error and aborts: "ERROR:
> copy: line 33, Invalid UNICODE character sequence found (0xd956)".

Try to convert the file from UTF-16 (which might be the encoding of the
file) to UTF-8 with iconv:

iconv -f UTF-16 -t UTF-8 file > file.UTF-8

If the file turns out not to be in UTF-16 but in some other encoding,
pass that encoding to -f instead.
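
If iconv is not at hand, the same conversion can be sketched in Python.
This is only an illustration, not part of the original advice: the
function names are made up, and the "utf-16-le" fallback is an
assumption based on the file coming from a Windows system. It looks for
a byte order mark (BOM) to guess the encoding, then re-encodes as UTF-8.

```python
import codecs
from typing import Optional

# Known byte order marks. UTF-32 is checked before UTF-16 because the
# UTF-32-LE BOM begins with the same two bytes as the UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def detect_bom_encoding(raw: bytes) -> Optional[str]:
    """Return a codec name if raw starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name
    return None


def to_utf8(raw: bytes, fallback: str = "utf-16-le") -> bytes:
    """Decode raw using its BOM (or the assumed fallback) and return UTF-8."""
    encoding = detect_bom_encoding(raw) or fallback
    # The "utf-16", "utf-32" and "utf-8-sig" codecs consume the BOM,
    # so it does not end up in the output.
    return raw.decode(encoding).encode("utf-8")


def convert_file(src: str, dst: str) -> None:
    """Hypothetical helper: read src as bytes, write it to dst as UTF-8."""
    with open(src, "rb") as f:
        raw = f.read()
    with open(dst, "wb") as f:
        f.write(to_utf8(raw))
```

Calling convert_file("file", "file.UTF-8") would then play the role of
the iconv command above. If the file has no BOM and is not actually
UTF-16, the fallback will produce garbage or raise an error, so the
real encoding still has to be known or guessed.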

By the way, Unicode is just a mapping between numbers (code points) and
characters; it doesn't say anything about how such a number is
represented in the byte stream. UTF-8 and UTF-16 are specifications of
such representations.
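
To make the distinction concrete, here is a small Python sketch. The
character is an arbitrary example, and reading the reported 0xd956 as
the two bytes 0xd9 0x56 is my assumption about the error message. One
code point yields different bytes under UTF-8 and UTF-16, and bytes
that are not UTF-8 are rejected by a UTF-8 decoder.

```python
ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE, chosen only as an example

code_point = ord(ch)                   # the Unicode number: 0xE9
utf8_bytes = ch.encode("utf-8")        # two bytes: c3 a9
utf16_bytes = ch.encode("utf-16-le")   # two different bytes: e9 00


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(hex(code_point), utf8_bytes.hex(), utf16_bytes.hex())  # 0xe9 c3a9 e900

# 0xd9 announces a two-byte UTF-8 sequence, but 0x56 is not a valid
# continuation byte, so the pair is not well-formed UTF-8.
print(is_valid_utf8(b"\xd9\x56"))  # False
```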

The encoding name in PostgreSQL should be changed from UNICODE to UTF-8
because UNICODE really just isn't an encoding.

--
Markus Bertheau <twanger@bluetwanger.de>

