Thread: Ordering and unicode
I have a database on PostgreSQL 8.0.3 with unicode (utf-8) encoding, client encoding is set to unicode, too. LC_COLLATE for the cluster is de_DE.iso885915@euro. I noticed that that collation doesn't work for two-byte characters, apparently they are ordered bytewise.

My current conjecture is that I'd have to re-initialize the cluster with a utf-8 collation. Is this correct? Right now I don't have access to the machine and can't check this.

Michael

--
Michael Schuerig                 Face reality and stare it down
mailto:michael@schuerig.de       --Jethro Tull, Silver River Turning
http://www.schuerig.de/michael/
On Thursday 10 November 2005 21:31, Michael Schuerig wrote:
> I have a database on PostgreSQL 8.0.3 with unicode (utf-8) encoding,
> client encoding is set to unicode, too. LC_COLLATE for the cluster is
> de_DE.iso885915@euro. I noticed that that collation doesn't work for
> two-byte characters, apparently they are ordered bytewise.
>
> My current conjecture is that I'd have to re-initialize the cluster
> with a utf-8 collation. Is this correct? Right now I don't have
> access to the machine and can't check this.

Yes, it is the case. Just checked on another installation.

Thanks for listening...

Michael

--
Michael Schuerig                 Thinking is trying to make up
mailto:michael@schuerig.de       for a gap in one's education.
http://www.schuerig.de/michael/  --Gilbert Ryle
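The bytewise ordering described above can be reproduced outside the database. The following is a minimal Python sketch, not PostgreSQL's actual comparison code: it sorts a few German words by their raw UTF-8 bytes, which is roughly what happens when the collation locale cannot interpret multi-byte characters. The word list and the helper name `bytewise_sort` are illustrative assumptions, not taken from the thread.

```python
# Sample words are illustrative; they do not come from the original posts.
words = ["Zebra", "Ärger", "Apfel", "Öl", "Ofen"]


def bytewise_sort(strings):
    """Sort strings by their UTF-8 byte sequences, ignoring any locale."""
    return sorted(strings, key=lambda s: s.encode("utf-8"))


# "Ä" and "Ö" each encode to two bytes starting with 0xC3, which is larger
# than the single byte of any ASCII letter, so they land after "Zebra".
print(bytewise_sort(words))
# ['Apfel', 'Ofen', 'Zebra', 'Ärger', 'Öl']
```

A German dictionary collation would instead be expected to place "Ärger" near "Apfel" and "Öl" near "Ofen".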
On 10.11.2005, at 21:31, Michael Schuerig wrote:
> My current conjecture is that I'd have to re-initialize the cluster
> with a utf-8 collation. Is this correct? Right now I don't have
> access to the machine and can't check this.

Yes, as far as I know there is no other way of changing the locale settings. Hopefully you are on Linux! If you deploy on Mac OS X or *BSD, it won't work even with a change.

cug
On 11.11.2005, at 9:33, Guido Neitzer wrote:
> Yes, as far as I know there is no other way of changing the locale
> settings. Hopefully you are on Linux! If you deploy on Mac OS X or
> *BSD it won't work even with a change.

I have to correct myself: hopefully you are not on Mac OS X. On Mac OS X, locale support is not yet available for UTF-8, so you will not get correct ordering. This comes from a statement by an Apple engineer; they have it on their to-do list.

Since the locale support in Mac OS X comes directly from BSD, I assume it is no better there, but I have tested this on only one BSD platform and looked through the provided locale files (in the given cvs directories) for others, where it seems similar.

This is not a problem of PostgreSQL, only a problem for PostgreSQL when running on the "wrong" platform.

cug
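Whether a given platform collates UTF-8 text correctly can be probed without touching the database. This is a rough sketch using Python's standard `locale` module, under some assumptions: the function name `utf8_collation_works`, the default locale name `de_DE.UTF-8`, and the two probe words are all illustrative, and a single comparison is only a hint, not a full check of the platform's collation tables.

```python
import locale


def utf8_collation_works(locale_name="de_DE.UTF-8"):
    """Return True if 'Ärger' collates before 'Zebra' under locale_name,
    False if it does not, and None if the locale is not installed."""
    saved = locale.setlocale(locale.LC_COLLATE)  # remember current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None  # locale unavailable on this system
    try:
        # A locale-aware comparison should put "Ä" near "A", i.e. before "Z".
        return locale.strcoll("Ärger", "Zebra") < 0
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)  # restore previous setting


print(utf8_collation_works())
```

A result of `False` would suggest the platform falls back to code-point or bytewise comparison for that locale, which matches the behaviour reported above for Mac OS X.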