Re: Unicode database on non-unicode operating system - Mailing list pgsql-general

From Peter Eisentraut
Subject Re: Unicode database on non-unicode operating system
Date 2008-07-15
Msg-id 200807151432.57828.peter_e@gmx.net
In response to Unicode database on non-unicode operating system  ("Morten Barklund" <morten.barklund@tbwa.dk>)
Responses Re: Unicode database on non-unicode operating system
List pgsql-general
On Tuesday, 15 July 2008, Morten Barklund wrote:
> My problem is that the lowercase versions of non-ASCII characters are
> broken. Specifically, I found that when lower() is invoked on a text with
> non-ASCII characters, the operating system's locale is used to convert
> each octet in the string to lowercase, instead of the database's locale
> being used to convert each character in the string to lowercase. This caused
> the Danish lowercase o with slash "ø", which in UTF-8 is represented as
> the Latin-1-readable octets "Ã¸", to be converted to the Latin-1-readable
> octets "ã¸", which the system then tried to interpret as a Unicode
> character. But the octets "ã¸" do not represent a Unicode character in
> UTF-8. The lowercase version of "ø" is of course just itself.

This means the server encoding and the locale you have configured do not
match.  Check SHOW lc_collate and SHOW server_encoding, and then pick a
combination that is compatible.  This will probably mean you have to run
initdb again.
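
To make the failure in the quoted message concrete, here is a minimal
Python sketch.  It is not PostgreSQL's actual code path; the helper names
latin1_lower_bytes and is_valid_utf8 are made up for this illustration,
and it assumes the operating system locale is Latin-1, so that each octet
is lowercased as if it were one Latin-1 character:

```python
def latin1_lower_bytes(data: bytes) -> bytes:
    """Lowercase each octet as if it were a single Latin-1 character.

    Hypothetical stand-in for a byte-wise, locale-driven lower() running
    under a Latin-1 locale on data that is really UTF-8.
    """
    return data.decode("latin-1").lower().encode("latin-1")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the octets form well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


original = "ø".encode("utf-8")           # octets C3 B8, read as "Ã¸" in Latin-1
mangled = latin1_lower_bytes(original)   # octets E3 B8, read as "ã¸" in Latin-1

print(original.hex(), is_valid_utf8(original))
print(mangled.hex(), is_valid_utf8(mangled))

# Character-wise lowercasing leaves "ø" unchanged, as it should.
print("ø".lower() == "ø")
```

The octet C3 is "Ã" in Latin-1 and gets lowercased to E3 ("ã"); in UTF-8,
E3 starts a three-octet sequence, so E3 B8 on its own is not a valid
character.  A locale whose character set matches the server encoding
avoids this.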