Thread: BUG #2625: Case insensitive regexp matching doesn't work on national characters

From: "Zoltan MEZEI"
The following bug has been logged online:

Bug reference:      2625
Logged by:          Zoltan MEZEI
Email address:      mezei.zoltan@telefor.hu
PostgreSQL version: 8.0.3
Operating system:   CentOS Linux 3.7
Description:        Case insensitive regexp matching doesn't work on
national characters
Details:

(the bug is also there in 8.1.4, used libc version is 2.3.2)

Symptom:
select 'á' ~* 'Á';
false
select upper('á') ~* upper('Á');
true

Information:
LC_CTYPE and LC_COLLATE are set to hu_HU.utf8. The database encoding is
UNICODE.

Proposed solution:
The problem is that the regex module doesn't use the functions from
wctype.h, and because of that, it cannot handle upper() for multibyte
characters properly. Using the wctype functions there should fix the problem.
:-)
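
A minimal sketch of what I mean, not a patch against the regex module. The
helper names wc_equal_ci() and wcs_equal_ci() are made up for illustration;
only towlower() from wctype.h is a real library call. It assumes the input has
already been converted to wchar_t, and that the program has called
setlocale(LC_CTYPE, ...) with a suitable locale (e.g. hu_HU.utf8), since the
mapping of non-ASCII characters such as 'á' depends on the active locale.

```c
#include <stdbool.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper: compare two wide characters, ignoring case.
 * towlower() consults the current LC_CTYPE locale, so non-ASCII letters
 * are only folded if the locale knows about them.
 */
bool
wc_equal_ci(wchar_t a, wchar_t b)
{
    return towlower((wint_t) a) == towlower((wint_t) b);
}

/*
 * Hypothetical helper: compare two NUL-terminated wide strings,
 * ignoring case, one character at a time.
 */
bool
wcs_equal_ci(const wchar_t *s, const wchar_t *t)
{
    while (*s != L'\0' && *t != L'\0')
    {
        if (!wc_equal_ci(*s, *t))
            return false;
        s++;
        t++;
    }
    /* Equal only if both strings ended at the same position. */
    return *s == *t;
}
```

With a Hungarian UTF-8 locale set, I would expect wc_equal_ci(L'\u00E1',
L'\u00C1') to report a match, which is the behaviour ~* should have.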