Re: Unicode problem again - Mailing list pgsql-general

From Albe Laurenz
Subject Re: Unicode problem again
Msg-id D960CB61B694CF459DCFB4B0128514C2023F8F0C@exadv11.host.magwien.gv.at
In response to Unicode problem again  (Garry Saddington <garry@schoolteachers.co.uk>)
Responses Re: Unicode problem again  (Michael Fuhr <mike@fuhr.org>)
List pgsql-general
Garry Saddington wrote:
> I have the following error:
>
> Postgres 8.3 via psycopg 1.1.21 and zope 2.10.
>
> ProgrammingError Error Value: ERROR: character 0xe28099 of encoding "UTF8" has no equivalent in "LATIN1" select distinct
[...]

The bytes 0xe28099 are the UTF-8 encoding of UNICODE code point
U+2019, a "right single quotation mark".

This is a "Windows character" - the only non-UNICODE codepages I
know that contain this character are the Microsoft codepages.
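To illustrate, here is a minimal Python sketch that decodes the bytes
from the error message and tests which encodings can represent the
character. It assumes that Python's "latin-1" and "cp1252" codecs
correspond to PostgreSQL's LATIN1 and WIN1252; the helper name
can_encode is made up for this example.

```python
import unicodedata

# The byte sequence reported in the error message.
raw = bytes.fromhex("e28099")
ch = raw.decode("utf-8")


def can_encode(text, codec):
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Show the code point and its official Unicode name.
print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# Assumption: "latin-1" stands in for LATIN1, "cp1252" for WIN1252.
print("latin-1:", can_encode(ch, "latin-1"))
print("cp1252: ", can_encode(ch, "cp1252"))
```

The character decodes to U+2019, cannot be encoded in latin-1, and can
be encoded in cp1252 (as the single byte 0x92).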

Microsoft programs are known to automagically change ASCII
characters to characters like that, so a frequent source of
such characters is copy & paste from a Microsoft text processor.

> I have changed client_encoding to Latin1 to get over errors
> caused by having the database in UTF8 and users trying to
> enter special characters like £ signs.
>
> Unfortunately, it seems there are already UTF8 encodings in
> the DB that have no equivalent in Latin1 from before the change.
>
> How can I get over this problem, and still allow special
> characters, ie have no error reports.

If you want to allow *all* special characters, you will have to
use UNICODE (UTF8) throughout, including client_encoding, and a
pretty comprehensive font.
Check whether all the software that you use supports UNICODE.

By using LATIN1 (or any other non-UNICODE codepage) you allow
*some* special characters. In that case you should not allow all
characters into your database.
You'll have to check data at entry time.
If you are confident that you will never need any non-LATIN1
characters in your database, you could create the database
with LATIN1 encoding; that way there will be an error message at
data entry time.
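A check at entry time could look roughly like the following Python
sketch. The function name unencodable is hypothetical, and the sketch
assumes that Python's "latin-1" codec accepts the same characters as
PostgreSQL's LATIN1; where and how you call it depends on your
application.

```python
def unencodable(text, codec="latin-1"):
    """Return (position, character) pairs that the codec cannot represent."""
    bad = []
    for i, ch in enumerate(text):
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            bad.append((i, ch))
    return bad


# A pound sign is fine in LATIN1; a typographic apostrophe is not.
sample = "It\u2019s \u00a35"
problems = unencodable(sample)
if problems:
    for pos, ch in problems:
        print(f"position {pos}: U+{ord(ch):04X} is not in latin-1")
else:
    print("ok")
```

An empty result means the text can be stored safely; otherwise the
application can reject the input or ask the user to correct it before
it ever reaches the database.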

If you know that all your data is from and for Windows, you could
also use encoding WIN1252 throughout.

Yours,
Laurenz Albe
