Re: once again, sorting with Unicode - Mailing list pgsql-sql

From Troy
Subject Re: once again, sorting with Unicode
Date
Msg-id 200302201051.h1KApSSN018184@tksoft.com
In response to Re: once again, sorting with Unicode  (Antti Haapala <antti.haapala@iki.fi>)
List pgsql-sql
You are right, of course. I was thinking in terms of the encoded
data. Applications usually get data in UTF-8 or UTF-16. If the
input data is true Unicode (16-bit code units, as in UTF-16), then
for characters up to U+00FF there is no difference in the byte
values (just skip the 0x00 bytes).
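To make the distinction concrete, here is a small Python sketch. The sample string is my own assumption, not data from the original poster's database, and it only covers characters up to U+00FF. It compares the code point values with the bytes produced by ISO 8859-1, UTF-16 (big-endian, no BOM) and UTF-8.

```python
# Illustrative only: the sample text is an assumption, chosen so that
# every character lies in the range U+0000..U+00FF.
text = "Zürich äß"

# Numeric (code point) values of each character.
code_points = [ord(ch) for ch in text]

# The same text in three encodings, as lists of byte values.
latin1_bytes = list(text.encode("iso-8859-1"))
utf16_bytes = list(text.encode("utf-16-be"))  # big-endian, no BOM
utf8_bytes = list(text.encode("utf-8"))

# ISO 8859-1 byte values are identical to the Unicode code points.
same_as_latin1 = code_points == latin1_bytes

# In UTF-16BE each of these characters is a 0x00 byte followed by the
# ISO 8859-1 byte, so skipping the 0x00 bytes recovers the same values.
utf16_high_bytes = utf16_bytes[0::2]
utf16_low_bytes = utf16_bytes[1::2]

# UTF-8 is different: every non-ASCII character becomes two bytes here.
print(same_as_latin1, utf16_low_bytes == latin1_bytes, utf8_bytes == latin1_bytes)
```

So "skip the 0x00 bytes" holds for UTF-16 within this range, but not for UTF-8, where the non-ASCII characters take a different, longer byte pattern.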

Cheers,

Troy



> 
> 
> On Wed, 19 Feb 2003, Troy wrote:
> 
> > > I have a multi-lingual database (currently 11 languages) which sorts
> > > fine in MySQL (8859-1 character set). I have now converted the data to
> > > Unicode and compiled PostgreSQL with Unicode support.
> > >
> > > I can select and insert Unicode and so was rather pleased about that.
> > > Until I saw that it wasn't working properly when ordering!
> >
> > The cause for the different values is the fact that Unicode characters
> > have different numeric values from ISO8859-1 and other encodings. Only
> > ASCII values are in sync with Unicode numeric values. This I am sure you
> > knew.
> 
> No, ISO8859-1 maps directly to Unicode up to U+00FF. So the actual
> _numeric_ values are the same. But the actual byte patterns are encoding
> dependent.
> 
> Have you set database encoding to UTF-8? Are you using proper UTF-8
> locales? POSIX compiled locales are often charset dependent.
> 
> -- 
> Antti Haapala
> 
> 
> 
> 