Thread: Encoding woes

Encoding woes

From
Bendik R.Johansen
Date:
Hello!
I am having a few encoding-related problems.

1. I have a database set up on my Mac OS X machine, PostgreSQL 8.0.2. I
am trying to dump the tables and load them into Postgres on a SUSE 9.2
box. I get the following error:
ERROR:  invalid byte sequence for encoding "UNICODE": 0xe5742c

I have tried to set the client encoding to utf-8 on both machines, but
it does not seem to help. It works fine with LATIN1, but I think I may
need Unicode (see point 2).



2. The system I am setting up will contain several tables with names
and addresses from many countries (Norway, Spain, Denmark, Sweden,
etc.).
   a. Is Unicode the way to go?
   b. Do I need to configure the server in any way, or is this a client
issue?


Any advice would be great!

Thank you

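The bytes reported in the error from point 1 can be inspected directly. The following is only a minimal sketch in Python (the helper name `decodes_as` is made up for illustration); it checks whether the sequence 0xe5742c is valid in UTF-8 and in Latin-1, which is consistent with the load working under LATIN1 but failing under UNICODE.

```python
# Bytes taken from the error message: 0xe5742c
raw = bytes.fromhex("e5742c")


def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if `data` is a valid byte sequence in `encoding`."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# In UTF-8, 0xe5 starts a three-byte character and must be followed by
# two continuation bytes (0x80-0xbf). 0x74 ("t") is not one of those.
utf8_ok = decodes_as(raw, "utf-8")

# In Latin-1 every byte is a character on its own; 0xe5 is "å".
latin1_ok = decodes_as(raw, "latin-1")
latin1_text = raw.decode("latin-1")

print(utf8_ok, latin1_ok, repr(latin1_text))
```

If the data really is Latin-1, the same text encoded as UTF-8 would be the bytes 0xc3 0xa5 0x74 0x2c, which is what a UNICODE database expects to receive.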

Re: Encoding woes

From
Tom Lane
Date:
"Bendik R.Johansen" <bendik.johansen@gmail.com> writes:
> I am having a few encoding-related problems.

> 1. I have a database set up on my Mac OS X machine, PostgreSQL 8.0.2. I
> am trying to dump the tables and load them into Postgres on a SUSE 9.2
> box. I get the following error:
> ERROR:  invalid byte sequence for encoding "UNICODE": 0xe5742c

I'm betting that the database encoding on the source machine was set as
SQL_ASCII, but that you then proceeded to store a lot of data that
wasn't plain ASCII.  To load it into a machine with a different database
encoding, you are going to have to be honest about what the data
actually is.  You should modify the dump file to set "client_encoding"
to the correct description of the source data (Latin1 maybe?).  Postgres
can translate Latin1 to Unicode ... but only if it's told what the data
encoding is.

            regards, tom lane
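
The suggestion to modify the dump file could be sketched roughly as follows. This is an illustration under assumptions, not a tested procedure: it assumes a plain-text SQL dump, that LATIN1 really is the encoding of the stored data (the reply only offers it as a guess), and the function name `declare_dump_encoding` is hypothetical. The dump is handled as raw bytes so the data itself is never re-encoded; only the `SET client_encoding` statement is rewritten, or added at the top if the dump has none.

```python
import re

# A "SET client_encoding = '...';" statement on a line of its own.
_CLIENT_ENCODING = re.compile(
    rb"^SET client_encoding\s*=\s*'[^']*';[ \t]*$", re.MULTILINE
)


def declare_dump_encoding(dump: bytes, encoding: str = "LATIN1") -> bytes:
    """Return the dump with client_encoding set to `encoding`.

    The first existing SET client_encoding line is replaced; if there is
    none, the statement is prepended. All other bytes are left untouched.
    """
    stmt = b"SET client_encoding = '" + encoding.encode("ascii") + b"';"
    # A function is used as the replacement so that no character in
    # `stmt` is interpreted as a regex escape.
    new_dump, count = _CLIENT_ENCODING.subn(lambda m: stmt, dump, count=1)
    if count == 0:
        new_dump = stmt + b"\n" + dump
    return new_dump


# Hypothetical two-line dump containing the bytes from the error message.
sample = b"SET client_encoding = 'SQL_ASCII';\nINSERT INTO t VALUES ('\xe5t,');\n"
fixed = declare_dump_encoding(sample)
print(fixed.decode("latin-1"))
```

The rewritten file would then be loaded into the UNICODE database in the usual way (for example with `psql -f`), leaving the Latin-1 to Unicode conversion to the server.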