BUG #2625: Case insensitive regexp matching doesn't work on national characters - Mailing list pgsql-bugs

From Zoltan MEZEI
Subject BUG #2625: Case insensitive regexp matching doesn't work on national characters
Date
Msg-id 200609131325.k8DDPUYW059944@wwwmaster.postgresql.org
List pgsql-bugs
The following bug has been logged online:

Bug reference:      2625
Logged by:          Zoltan MEZEI
Email address:      mezei.zoltan@telefor.hu
PostgreSQL version: 8.0.3
Operating system:   Centos Linux 3.7
Description:        Case insensitive regexp matching doesn't work on
national characters
Details:

(The bug is also present in 8.1.4. The libc version in use is 2.3.2.)

Symptom:
select 'á' ~* 'Á';
false
select upper('á') ~* upper('Á');
true

Information:
LC_CTYPE and LC_COLLATE are set to hu_HU.utf8. The database encoding is
UNICODE.

Proposed solution:
The problem is that the regex module doesn't use the functions from
wctype.h, and because of that it cannot case-fold multibyte characters
properly. If it used the wctype functions, the problem would be solved.
:-)
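
To illustrate the proposal, here is a minimal sketch of a case-insensitive comparison of two wide characters using towlower() and towupper() from wctype.h. The helper name wchar_equal_ci is hypothetical and this is not code from the PostgreSQL regex module. It only shows the kind of locale-aware folding the report asks for.

```c
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper (not PostgreSQL code): returns nonzero when the two
 * wide characters are equal after case folding in the current LC_CTYPE
 * locale. Both towlower() and towupper() are tried, because a few
 * characters fold to the same value in only one direction.
 */
static int
wchar_equal_ci(wchar_t a, wchar_t b)
{
    if (a == b)
        return 1;
    return towlower((wint_t) a) == towlower((wint_t) b)
        || towupper((wint_t) a) == towupper((wint_t) b);
}
```

The caller is assumed to have selected the locale first, for example with setlocale(LC_CTYPE, "") from locale.h, and to have converted the multibyte input to wchar_t (for instance with mbstowcs()). With LC_CTYPE set to hu_HU.utf8, as in this report, 'á' and 'Á' should then compare as equal, provided the system's locale data defines that case mapping.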
