Thread: Ordering and unicode

Ordering and unicode

From
Michael Schuerig
Date:
I have a database on PostgreSQL 8.0.3 with unicode (utf-8) encoding,
client encoding is set to unicode, too. LC_COLLATE for the cluster is
de_DE.iso885915@euro. I noticed that that collation doesn't work for
two-byte characters, apparently they are ordered bytewise.
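
A minimal sketch of the symptom described above, assuming the ordering is effectively bytewise on the UTF-8 encoding. The word list is illustrative and not taken from the thread; `bytewise_sorted` is a hypothetical helper name.

```python
# Illustrative German words; umlauts are two bytes each in UTF-8.
words = ["Zebra", "Ärger", "Apfel", "Öl", "Ofen"]


def bytewise_sorted(items):
    """Sort strings by their raw UTF-8 bytes, ignoring any locale rules."""
    return sorted(items, key=lambda s: s.encode("utf-8"))


# The lead byte of "Ä" and "Ö" (0xC3) is larger than any ASCII letter,
# so words starting with an umlaut end up after "Zebra".
print(bytewise_sorted(words))
```

A German collation would instead be expected to place "Ärger" near "Apfel" and "Öl" near "Ofen".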

My current conjecture is that I'd have to re-initialize the cluster with
a utf-8 collation. Is this correct? Right now I don't have access to
the machine and can't check this.

Michael

--
Michael Schuerig                       Face reality and stare it down
mailto:michael@schuerig.de        --Jethro Tull, Silver River Turning
http://www.schuerig.de/michael/

Re: Ordering and unicode

From
Michael Schuerig
Date:
On Thursday 10 November 2005 21:31, Michael Schuerig wrote:
> I have a database on PostgreSQL 8.0.3 with unicode (utf-8) encoding,
> client encoding is set to unicode, too. LC_COLLATE for the cluster is
> de_DE.iso885915@euro. I noticed that that collation doesn't work for
> two-byte characters, apparently they are ordered bytewise.
>
> My current conjecture is that I'd have to re-initialize the cluster
> with a utf-8 collation. Is this correct? Right now I don't have
> access to the machine and can't check this.

Yes, it is the case. Just checked on another installation. Thanks for
listening...

Michael

--
Michael Schuerig                          Thinking is trying to make up
mailto:michael@schuerig.de                for a gap in one's education.
http://www.schuerig.de/michael/                          --Gilbert Ryle

Re: Ordering and unicode

From
Guido Neitzer
Date:
On 10.11.2005, at 21:31 Uhr, Michael Schuerig wrote:

> My current conjecture is that I'd have to re-initialize the cluster
> with a utf-8 collation. Is this correct? Right now I don't have
> access to the machine and can't check this.

Yes, as far as I know there is no other way of changing the locale
settings. Hopefully you are on Linux! If you deploy on Mac OS X or
*BSD it won't work even with a change.
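
As a rough sketch of what re-initializing with a UTF-8 locale could look like, the helper below only assembles an `initdb` argument list and runs nothing. The locale name `de_DE.UTF-8`, the data directory path, and the function name `build_initdb_command` are assumptions for illustration; the locale names actually available depend on the operating system. Existing data would have to be dumped beforehand and restored into the new cluster.

```python
def build_initdb_command(data_dir, locale_name="de_DE.UTF-8", encoding="UNICODE"):
    """Return the argument list for an initdb run; nothing is executed here.

    data_dir, locale_name and encoding are placeholders to adapt to the
    actual installation.
    """
    return [
        "initdb",
        "-D", data_dir,                # location of the new cluster
        "--locale=" + locale_name,     # sets LC_COLLATE, LC_CTYPE, etc.
        "--encoding=" + encoding,      # default encoding for new databases
    ]


# Hypothetical data directory, shown only to illustrate the call.
print(" ".join(build_initdb_command("/path/to/new/pgdata")))
```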

cug



Re: Ordering and unicode

From
Guido Neitzer
Date:
On 11.11.2005, at 9:33 Uhr, Guido Neitzer wrote:

> Yes, as far as I know there is no other way of changing the locale
> settings. Hopefully you are on Linux! If you deploy on Mac OS X or
> *BSD it won't work even with a change.

I have to correct myself: hopefully you are not on Mac OS X. On Mac OS X,
locale support is not yet available for UTF-8, so you will not get
correct ordering. This comes from a statement by an Apple engineer;
they have it on their to-do list.

As the locale support in Mac OS X comes directly from BSD, I assume
it's no better there, but I have tested this on only one BSD
platform. For the others I looked through the provided locale files
(in the given CVS directories), where it seems similar.

This is not a problem of PostgreSQL, only a problem for PostgreSQL
when running on the "wrong" platform.
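
One way to probe whether a given platform's UTF-8 locale collates umlauts as expected is sketched below, outside PostgreSQL, using the operating system's own `strcoll` via Python's `locale` module. The locale name `de_DE.UTF-8`, the test words, and the function name `collation_handles_umlauts` are assumptions for illustration.

```python
import functools
import locale


def collation_handles_umlauts(locale_name="de_DE.UTF-8"):
    """Return True if the locale sorts 'Ärger' between 'Apfel' and 'Zebra',
    False if it does not, and None if the locale is not installed."""
    saved = locale.setlocale(locale.LC_COLLATE)  # remember the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None  # locale unavailable on this system
    try:
        ordered = sorted(
            ["Zebra", "Ärger", "Apfel"],
            key=functools.cmp_to_key(locale.strcoll),
        )
        return ordered == ["Apfel", "Ärger", "Zebra"]
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)  # restore the old setting


print(collation_handles_umlauts())
```

A result of `False` would suggest the platform falls back to something like byte order for that locale, which is the behavior reported above for Mac OS X.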

cug
