UNICODE string collating, case insensitive matching - Mailing list pgsql-general

From Cestmir Hybl Jr.
Subject UNICODE string collating, case insensitive matching
Date
Msg-id 020d01c2e286$de2336e0$0200a8c0@stratos
List pgsql-general
Hello,
 
(1) I have a question about multibyte support in PostgreSQL:
 
Why do collation and character case operations (Upper, Lower, ILIKE) in Postgres use libc locales instead of the Unicode specification when the database encoding is UTF-8? This is useless in a real multilingual environment, where strings in multiple languages are stored in the same database. Such strings cannot be handled by a single locale.
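
To illustrate why one ordering cannot suit every language, here is a minimal Python sketch. It is a toy under stated assumptions, not the Unicode Collation Algorithm: the helper slovak_key and the word list are hypothetical, and the only language rule modelled is that Slovak treats "ch" as a single letter sorted after "h".

```python
# Toy illustration only: real multilingual collation needs the Unicode
# Collation Algorithm with per-language tailoring, not a string replacement.

def slovak_key(word: str) -> str:
    # Hypothetical helper: Slovak sorts the digraph "ch" after "h".
    # Mapping "ch" to "h" + U+FFFF places it after ordinary words starting with "h".
    return word.lower().replace("ch", "h\uffff")

words = ["chata", "cena", "hora", "dom"]

# Plain code point order, which happens to match what an English reader expects.
code_point_order = sorted(words)

# Order under the single Slovak rule modelled above.
slovak_order = sorted(words, key=slovak_key)

print(code_point_order)
print(slovak_order)
```

The same four strings end up in two different orders depending on which language's rule is applied, so a column that mixes languages has no single correct locale-based ordering.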
 
There are several Unicode technical standards relevant to this:
  http://www.unicode.org/reports/tr10/ - Unicode Collation Algorithm
 
 
(2) Does anyone have a pgsql database cluster with UTF-8 encoding and a *.UTF-8 locale in which the Upper, Lower, and ILIKE functions work properly?
 
I have compiled the sk_SK.UTF-8 locale, and string collation works fine (a /select ... order by some_field/ query returns a properly collated dataset), but /select Upper(some_field), Lower(some_field)/ and /select ... where some_field ILIKE '%...some non-ASCII text...%'/ do not work.
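
One plausible explanation for this symptom can be sketched in Python. The assumption, which this message does not confirm, is that the case functions process the string one byte at a time, the way single-byte libc routines such as toupper() do. The function names bytewise_upper and unicode_upper are hypothetical and exist only for this illustration.

```python
def bytewise_upper(text: str) -> str:
    """Upper-case a UTF-8 string one byte at a time (assumed server behaviour)."""
    raw = text.encode("utf-8")
    # Only ASCII 'a'..'z' can be mapped per byte; every byte of a multibyte
    # UTF-8 sequence is >= 0x80 and is passed through unchanged.
    mapped = bytes(b - 32 if 0x61 <= b <= 0x7A else b for b in raw)
    return mapped.decode("utf-8")


def unicode_upper(text: str) -> str:
    """Upper-case using Unicode case mapping on whole characters."""
    return text.upper()


sample = "žltý kôň"  # Slovak sample text with non-ASCII letters

print(bytewise_upper(sample))
print(unicode_upper(sample))

# In ISO-8859-2 each of these letters is one byte, so a per-byte
# routine can map it; in UTF-8 the same letter takes two bytes.
print(len("ž".encode("iso-8859-2")), len("ž".encode("utf-8")))
```

Under this assumption the ASCII letters change case and the accented ones do not, which would also be consistent with the single-byte ISO-8859-2 setup behaving correctly.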
 
All of this works fine in the sk_SK.ISO-8859-2 locale.
 
 
Cestmir Hybl
