Thread: help with utf8

help with utf8

From
Hugo
Date:
Hi

Can anybody help with this error? I attached an image.


thanks

PostgreSQL 8.1.0 on WinXP Pro SP1

Attachment

Re: help with utf8

From
Hugo
Date:
I don't know if this is the best solution, but I dropped the database and recreated it with SQL_ASCII instead of UTF-8, and everything worked as expected.

Hugo

On 01/12/05, Hugo <htakada@gmail.com> wrote:
Hi

Can anybody help with this error? I attached an image.


thanks

PostgreSQL 8.1.0 on WinXP Pro SP1



Re: help with utf8

From
Richard Huxton
Date:
Hugo wrote:
> don't know if this is the best solution but I dropped the database and
> recreated it with SQL_ASCII instead of UTF-8 and everything worked as
> expected

It would: SQL_ASCII just accepts whatever it's given, without checking
that the bytes are valid in any particular encoding.

Hugo - a quick tip for future questions: just post the text of the
message, not a picture of it. The picture is much larger than the text
and can't be indexed by Google or the mailing-list archive software.

The error you are getting is "invalid UTF-8 byte sequence". I would
guess it's the point in the data marked by the question-mark: "ALEJO
GARCIA N?367"

1. Are you sure that the byte-sequence you have is UTF-8 (and not UTF-16
or other)?
2. How have you tested this?
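
One way to test this, as a minimal sketch: ask a strict UTF-8 decoder where it first fails. The sample bytes below are an assumption (the "º" stored as the single byte 0xBA, as Latin-1 would store it), not the contents of the actual file, and `first_invalid_utf8` is a hypothetical helper name.

```python
def first_invalid_utf8(data: bytes):
    """Return (offset, byte value) of the first byte that breaks strict
    UTF-8 decoding, or None if the data is valid UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start, data[err.start]
    return None


# Assumed sample: "º" stored as one byte, 0xBA, which is not valid UTF-8.
legacy_bytes = b"ALEJO GARCIA N\xba 367"
print(first_invalid_utf8(legacy_bytes))  # (14, 186)

# The same text properly encoded as UTF-8 passes the check.
utf8_bytes = "ALEJO GARCIA N\u00ba 367".encode("utf-8")
print(first_invalid_utf8(utf8_bytes))    # None
```

Reading the real file in binary mode (`open(path, "rb").read()`) and passing the result to the helper would point at the offending offset.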

--
   Richard Huxton
   Archonet Ltd

Re: help with utf8

From
Hugo
Date:
Richard,

I have to import data from a CSV file with the pgsql COPY command. All I know is that the file comes from an old program written in Clipper (or something like it, under Linux). My database was created with UTF8, and I can't find how to create it with UTF-16 on Windows. What is the recommended encoding for Postgres on the Spanish version of Windows?

Yes, the problem is at that point; it should read "ALEJO GARCIA Nº 367".

Thanks for your help

Hugo


Re: help with utf8

From
Richard Huxton
Date:
Hugo wrote:
> Richard,
>
> I have to import data from a CSV file with the pgsql COPY command. I just
> know that this file comes from an old program written in Clipper (or
> something like it, under Linux). My database was created with UTF8; I can't
> find how to create it with UTF-16 on Windows. What is the recommended
> encoding for Postgres on the Spanish version of Windows?

Of the Unicode encodings, PostgreSQL only supports UTF-8, not UTF-16. If
it's just Spanish data, your file might be in one of ISO-8859-1 or
ISO-8859-15 (mostly the same as -1 but with the Euro sign). These are
also known as Latin1 and Latin9. See the link below for a list of the
various character sets available.
   http://www.jbrowse.com/text/charsets.html

Alternatively, it might be in some Windows-specific character set, if
that's where the file came from. On Linux you should be able to use the
"file" utility to identify the character-set of some text.
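
Where "file" is not available, a rough substitute is to decode a suspicious fragment with several candidate encodings and see which one yields the text you expect. The sketch below is only a heuristic. The candidate list (including the DOS code pages cp850 and cp437, which an old Clipper program might have used) and the sample bytes are assumptions, and the helper names are hypothetical.

```python
# Candidate encodings to try; this list is an assumption, extend as needed.
CANDIDATES = ["utf-8", "latin-1", "iso8859-15", "cp1252", "cp850", "cp437"]


def decode_candidates(raw: bytes, encodings=CANDIDATES):
    """Map each encoding name to the decoded text, or None if decoding fails."""
    results = {}
    for name in encodings:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


def matching_encodings(raw: bytes, expected: str):
    """List the candidate encodings under which raw decodes to expected."""
    return [name for name, text in decode_candidates(raw).items()
            if text == expected]


expected = "N\u00ba 367"  # "Nº 367"

# If "º" is stored as byte 0xBA, the Latin-style encodings match.
print(matching_encodings(b"N\xba 367", expected))
# ['latin-1', 'iso8859-15', 'cp1252']

# If "º" is stored as byte 0xA7, the DOS code pages match instead.
print(matching_encodings(b"N\xa7 367", expected))
# ['cp850', 'cp437']
```

Several encodings can agree on one character, so it helps to test a few different accented characters from the real data before settling on one.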

Make yourself a coffee, sit down and read the "localization" section
of the manuals. Make sure you understand the difference between server
and client encodings; you might find the bit on conversions useful too.
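
Once the file's real encoding is known, one option is to rewrite it as UTF-8 before loading it into a UTF8 database. This is a minimal sketch: `transcode_to_utf8` is a hypothetical helper, and the file names and the "latin-1" source encoding in the usage comment are assumptions. The alternative is to leave the file alone and set the client encoding to match it (for example `SET client_encoding TO 'LATIN1';`) before running COPY, so the server does the conversion.

```python
def transcode_to_utf8(src_path, dst_path, source_encoding):
    """Copy a text file, decoding it from source_encoding and writing UTF-8.

    newline="" keeps the original line endings untouched. A byte that is
    invalid in source_encoding raises UnicodeDecodeError instead of being
    silently replaced.
    """
    with open(src_path, "r", encoding=source_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)


# Hypothetical usage (file names and source encoding are assumptions):
# transcode_to_utf8("clients.csv", "clients_utf8.csv", "latin-1")
```

The converted file can then be loaded with COPY in the usual way.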

--
   Richard Huxton
   Archonet Ltd