Re: Linux x Windows LOCALE/ENCODING compatibility - Mailing list pgsql-general

From Carlos Henrique Reimer
Subject Re: Linux x Windows LOCALE/ENCODING compatibility
Date
Msg-id AANLkTik_fxC2sHv0x7LQ-66sVH8S4dwF4VpAMUdyCGQQ@mail.gmail.com
In response to Re: Linux x Windows LOCALE/ENCODING compatibility  (Filip Rembiałkowski <filip.rembialkowski@gmail.com>)
Responses Re: Linux x Windows LOCALE/ENCODING compatibility  (Marco Colombo <pgsql@esiway.net>)
List pgsql-general
Hi,

I'm currently moving the data from the Windows server to the new Linux box, but I'm running into problems with the encoding.

Additional configuration information: Windows is running PG 8.3 and the new Linux box is PG 8.4.

Windows dump command:
pg_dump -U postgres -Fc -v -f "f:\backup

Linux restore command:
pg_restore -v -n brasil -d mapas /backup


pg_restore error:
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 3574; 0 40805 TABLE DATA cidade postgres
pg_restore: [archiver (db)] COPY failed: ERROR:  character 0x81 of encoding "WIN1252" has no equivalent in "UTF8"
CONTEXT:  COPY cidade, line 6

I also tried to dump using pg_dump -E UTF8 but then I got:
pg_dump: SQL command failed
pg_dump: Error message from server: ERROR:  character 0x81 of encoding "WIN1252" has no equivalent in "UTF8"
pg_dump: The command was: COPY brasil.cidade (gid, "municpio", "municpi0", uf, longitude, latitude, the_geom) TO stdout;
pg_dump: *** aborted because of error
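
Both errors point at the same cause: byte 0x81 is not assigned to any character in Windows-1252, so there is nothing to convert it to. The sketch below illustrates this outside PostgreSQL. It uses Python's cp1252 codec as a stand-in for the server's WIN1252 conversion table (an assumption; the two tables may differ in detail), and the sample value is hypothetical, not data from the cidade table.

```python
def undefined_cp1252_bytes():
    """List the byte values that Python's cp1252 codec cannot decode."""
    bad = []
    for value in range(256):
        try:
            bytes([value]).decode("cp1252")
        except UnicodeDecodeError:
            bad.append(value)
    return bad


def find_unconvertible(raw: bytes):
    """Return (offset, byte) pairs in raw that have no cp1252 meaning."""
    undefined = set(undefined_cp1252_bytes())
    return [(i, b) for i, b in enumerate(raw) if b in undefined]


# Hypothetical column value: valid cp1252 text followed by a stray 0x81 byte.
sample = b"S\xe3o Paulo\x81"

print([hex(b) for b in undefined_cp1252_bytes()])
print(find_unconvertible(sample))
```

A scan like this over a plain-text export of the affected column would show which rows carry such bytes; they have to be corrected or removed at the source before any conversion to UTF8 can succeed.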

How can I fix this error?

Thank you!


2010/11/1 Filip Rembiałkowski <filip.rembialkowski@gmail.com>
2010/11/1 Carlos Henrique Reimer <carlos.reimer@opendb.com.br>
>
> Hi,
>
> I currently have my PostgreSQL server running on a Windows box, and we're now migrating it to a Linux operating system.
>
> Current windows configuration:
> pg_controldata shows the LC_COLLATE and LC_CTYPE are Portuguese_Brasil.1252
> psql \l command shows we have databases with encoding WIN1252 and UTF8
>
> New Linux box:
> Which --locale and --encoding options should I use for the INITDB command to have a compatible locale and encoding cluster?
> Which --encoding option should I use in the CREATEDB command to have a compatible encoding database?

from your description I assume you have 8.3 on Windows,
but you did not tell us which PostgreSQL version you are planning to
deploy on Linux. I'd recommend 8.4.

main difference is that on Windows, UTF-8 encoding can be used with
any locale. on Unix, the encoding has to match the locale's character
set (the C/POSIX locale aside).

generally, it is safe to do ``initdb --locale=pt_BR.UTF-8'' and create
fresh databases with default settings.

then, use Linux pg_dump and psql to clone Windows databases. like this:
pg_dump -h windowsbox -p 5432 -U postgres DBNAME | psql DBNAME
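
A shell pipeline like the one above reports only the exit status of psql, so a failing pg_dump can go unnoticed. A minimal sketch of the same pipeline driven from Python, checking both exit codes, might look as follows. The function names are hypothetical; the host, port, user and database name are the placeholders from the command above.

```python
import subprocess


def build_clone_commands(host, dbname, port=5432, user="postgres"):
    """Build the argv lists for 'pg_dump ... | psql DBNAME'."""
    dump_cmd = ["pg_dump", "-h", host, "-p", str(port), "-U", user, dbname]
    load_cmd = ["psql", dbname]
    return dump_cmd, load_cmd


def clone_database(host, dbname, port=5432, user="postgres"):
    """Pipe pg_dump from the remote host into a local psql session."""
    dump_cmd, load_cmd = build_clone_commands(host, dbname, port, user)
    dump = subprocess.Popen(dump_cmd, stdout=subprocess.PIPE)
    load = subprocess.Popen(load_cmd, stdin=dump.stdout)
    # Close our copy of the pipe so pg_dump gets SIGPIPE if psql exits early.
    dump.stdout.close()
    load_rc = load.wait()
    dump_rc = dump.wait()
    if dump_rc != 0 or load_rc != 0:
        raise RuntimeError(
            "clone failed: pg_dump exit %d, psql exit %d" % (dump_rc, load_rc)
        )


# Usage, assuming the target database already exists on the Linux side:
# clone_database("windowsbox", "DBNAME")
```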

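a UTF-8 database works with any client encoding: the server converts
automatically, as long as the stored characters exist in the client's encoding.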

you will get compatible behaviour for your existing apps with
SET client_encoding TO 'whatever'; /* this can be entered as pre-SQL
on every client connection */

see http://www.postgresql.org/docs/8.4/static/multibyte.html,
22.2.3. Automatic Character Set Conversion Between Server and Client
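
To make the idea of automatic conversion concrete, here is a rough sketch in plain Python of what happens to text stored as UTF-8 when a client asks for WIN1252. It does not talk to a server; the two helper functions are hypothetical, and Python's cp1252 codec is assumed to be close enough to the server's WIN1252 table for illustration.

```python
def server_to_client(stored_utf8: bytes, client_encoding: str) -> bytes:
    """Mimic server-to-client conversion: decode UTF-8, re-encode for the client."""
    return stored_utf8.decode("utf-8").encode(client_encoding)


def client_to_server(sent: bytes, client_encoding: str) -> bytes:
    """Mimic client-to-server conversion: decode client bytes, store as UTF-8."""
    return sent.decode(client_encoding).encode("utf-8")


# Text that exists in both encodings survives the round trip unchanged.
stored = "São Paulo".encode("utf-8")
as_win1252 = server_to_client(stored, "cp1252")
round_trip_ok = client_to_server(as_win1252, "cp1252") == stored

# Text outside the client's repertoire cannot be delivered in that encoding.
try:
    server_to_client("Łódź".encode("utf-8"), "cp1252")
    conversion_failed = False
except UnicodeEncodeError:
    conversion_failed = True

print(round_trip_ok, conversion_failed)
```

The second case is the mirror image of the 0x81 problem: conversion works only for characters that both sides can represent.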


HTH

--
Filip Rembiałkowski
JID,mailto:filip.rembialkowski@gmail.com
http://filip.rembialkowski.net/



--
Reimer
47-3347-1724 47-9183-0547 msn: carlos.reimer@opendb.com.br
