About upper() and lower to handle multibyte char - Mailing list pgsql-general

From Weiping
Subject About upper() and lower to handle multibyte char
Msg-id 41751C0D.2020209@qmail.zhengmai.net.cn
List pgsql-general

Hi,

While upgrading to 8.0 (beta3) we ran into a problem.

We have a database whose encoding is UNICODE.
When we run a query like:
select upper('中文'); -- a string of multibyte characters
PostgreSQL responds:

ERROR: invalid multibyte character for locale

But when we run the same query in a database with SQL_ASCII encoding,
it works and returns the string unchanged, which is what we think is the
correct result.

I've searched the archives and found that in 8.0 the upper()/lower()
functions were changed so that they can handle multibyte characters.
But what is the expected behavior of these two functions when they are
given multibyte characters?

Another question: from the archives, I understand that on systems that
provide the <wctype.h> toupper/tolower functions, PostgreSQL supports
multibyte upper/lower. My system (Slackware 10) has <wctype.h>, so why do
I still get the ERROR? How can I check whether my PostgreSQL installation
was built with multibyte upper/lower support?
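
To illustrate the behavior we expect, here is a minimal C sketch of a
per-character conversion built on <wctype.h>. It is only an illustration,
not PostgreSQL's actual code: the helper name upper_wide is hypothetical,
and it assumes the string has already been converted to wide characters.
The C library's towupper() returns its argument unchanged when a character
has no upper-case mapping, so Chinese characters should pass through
untouched while the English letters are converted.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper (not PostgreSQL code): upper-case a wide-character
 * string in place and return it.
 *
 * towupper() returns its argument unchanged when the character has no
 * upper-case mapping, so caseless characters such as CJK ideographs are
 * left as they are.
 */
static wchar_t *upper_wide(wchar_t *s)
{
    assert(s != NULL);

    for (size_t i = 0; s[i] != L'\0'; i++)
        s[i] = (wchar_t) towupper((wint_t) s[i]);

    return s;
}
```

Called on a wide string holding "ab", the two characters of '中文', and
"c", this should upper-case only the three Latin letters.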

This problem makes things very difficult for us when we use upper/lower on
columns that mix characters from more than one script, such as Chinese and
English characters in a Unicode database, because the transaction aborts
with the error above, and that breaks our application badly.

Thanks, any help would be appreciated.

Laser

