Re: [GENERAL] postgres & server encodings - Mailing list pgsql-admin

From Alvaro Herrera
Subject Re: [GENERAL] postgres & server encodings
Date
Msg-id 20050809171853.GB21039@alvh.no-ip.org
In response to Re: [GENERAL] postgres & server encodings  ("Joel Fradkin" <jfradkin@wazagua.com>)
Responses Re: [GENERAL] postgres & server encodings  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-admin
On Tue, Aug 09, 2005 at 12:56:37PM -0400, Joel Fradkin wrote:
> Not that I am an expert or anything, but my initial database was SQLASCII
> and I did have to convert it to Unicode.
> My reasons were that we store French characters in our database and the
> newer odbc driver was not displaying them correctly coming from SQLASCII,
> but was from UNICODE.
> I also think that it can affect functions like length and upper, but Tom
> knows a ton more than me about this stuff.
>
> I did my initial conversion on 7.4 and the odbc driver at that time had no
> issues with SQLASCII displaying the French, but I think in 8.0.1 I started
> seeing an issue. The latest version of the driver 8.0.4 seems to be working
> well (only up a little over 24 hours thus far).

A conversion will work fine provided all the data is encoded using the
same encoding.  So if it's all UTF-8 ("Unicode") already, you can import
it verbatim into a UTF-8 database and it will work fine.  If it's all
Latin-1, you can import it into a UTF-8 database by setting
client_encoding=latin1 during the import, or verbatim into a Latin-1
database, and it will also work fine.  (You are of course expected to be
able to figure out what encoding the data is really in.)
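
To make the Latin-1 case concrete, here is a rough Python sketch of the
byte-level transformation that happens when client_encoding is latin1
and the database encoding is UTF-8.  It only simulates the conversion
and does not talk to a server; the function name and the sample bytes
are made up for illustration.

```python
# Sketch only: mimics the Latin-1 -> UTF-8 recoding the server performs
# on incoming data when client_encoding=latin1 and the database is UTF-8.
# latin1_to_utf8 and dump_line are hypothetical names, not PostgreSQL API.

def latin1_to_utf8(raw: bytes) -> bytes:
    """Interpret raw as Latin-1 text and return the same text as UTF-8."""
    return raw.decode("latin-1").encode("utf-8")

# "café" as it would appear in a Latin-1 dump file: é is the single byte 0xE9.
dump_line = b"caf\xe9"

# After conversion, é becomes the two-byte UTF-8 sequence 0xC3 0xA9.
stored = latin1_to_utf8(dump_line)
print(stored)
```

If the dump were loaded without declaring the client encoding, the lone
0xE9 byte would not be valid UTF-8, which is why the declared encoding
has to match what the data is really in.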

The problem only shows up when you have mixed data -- say, you have two
applications, one website in PHP which inserts data in Latin-1, and a
Windows app which inserts in UTF-8.  In this case your data will be a
mess to fix, and there's no way a single conversion will get it right.
You will have to manually separate the parts that are UTF-8 from the
parts that are Latin-1, and import them separately.  Not a position I'd
like to be in.
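
As a starting point for that separation, one possible heuristic is
sketched below in Python: treat a value as UTF-8 if it decodes cleanly
as UTF-8, and fall back to Latin-1 otherwise.  This is an assumption on
my part rather than a guaranteed method, since some Latin-1 byte
sequences also happen to be valid UTF-8, so the results still need a
manual check.  The function name and sample rows are hypothetical.

```python
# Heuristic sketch for sorting mixed-encoding values into buckets.
# classify() and the sample rows are illustrative, not from any tool.

def classify(raw: bytes) -> str:
    """Guess the encoding of one value: 'ascii', 'utf8', or 'latin1'."""
    if all(b < 0x80 for b in raw):
        return "ascii"      # pure ASCII is identical in both encodings
    try:
        raw.decode("utf-8")  # strict decoding rejects stray high bytes
    except UnicodeDecodeError:
        return "latin1"      # not valid UTF-8, so assume Latin-1
    return "utf8"

rows = [
    b"plain",         # ASCII only
    b"caf\xe9",       # Latin-1 encoded
    b"caf\xc3\xa9",   # UTF-8 encoded
]

# Group the rows by guessed encoding so each group can be imported
# separately with the matching client_encoding.
buckets = {}
for row in rows:
    buckets.setdefault(classify(row), []).append(row)

for name, members in sorted(buckets.items()):
    print(name, members)
```

Each bucket could then be imported on its own with the matching
client_encoding; the ASCII bucket is safe to load either way.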

--
Alvaro Herrera (<alvherre[a]alvh.no-ip.org>)
"Coge la flor que hoy nace alegre, ufana. ¿Quién sabe si nacera otra mañana?"
