Regexp matching behaves differently for non-ASCII characters depending on the
server encoding (bug.sql contains non-ASCII characters):
% initdb -E KOI8-R --locale ru_RU.KOI8-R
% psql postgres < bug.sql
true
------
t
(1 row)
true | true
------+------
t | t
(1 row)
% initdb -E UTF8 --locale ru_RU.UTF-8
% psql postgres < bug.sql
true
------
f
(1 row)
true | true
------+------
f | t
(1 row)
As far as I can see, this happens because the regex code uses isalpha (and the
other is* functions), tolower and toupper instead of the isw* and tow*
functions. Is there any reason to use them? If not, I can modify regc_locale.c
along the lines of the locale handling in tsearch2.
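
To illustrate the difference, here is a minimal sketch. It is not the code from
regc_locale.c; bytes_alpha, wide_alpha and MAX_WCHARS are hypothetical names
made up for this example. The first function classifies each byte on its own,
the way the is* functions do. The second decodes the string into wide
characters under the current LC_CTYPE and only then classifies them.

```c
#include <ctype.h>
#include <stdlib.h>
#include <wchar.h>
#include <wctype.h>

#define MAX_WCHARS 64           /* arbitrary limit for this sketch */

/*
 * Byte-wise check: every byte is classified on its own with isalpha().
 * In a multibyte encoding the bytes of one character are tested separately.
 * Returns 1 if all bytes are alphabetic, 0 otherwise.
 */
int
bytes_alpha(const char *s)
{
    for (; *s != '\0'; s++)
        if (!isalpha((unsigned char) *s))
            return 0;
    return 1;
}

/*
 * Character-wise check: decode the string with the current LC_CTYPE,
 * then classify whole characters with iswalpha().
 * Returns 1 if all characters are alphabetic, 0 if some character is not,
 * and -1 if the string cannot be decoded or does not fit in the buffer.
 */
int
wide_alpha(const char *s)
{
    wchar_t     buf[MAX_WCHARS];
    size_t      n = mbstowcs(buf, s, MAX_WCHARS);
    size_t      i;

    if (n == (size_t) -1 || n == MAX_WCHARS)
        return -1;
    for (i = 0; i < n; i++)
        if (!iswalpha((wint_t) buf[i]))
            return 0;
    return 1;
}
```

With a single-byte encoding such as KOI8-R, both checks look at the same
units, so they should agree. After setlocale(LC_CTYPE, "ru_RU.UTF-8")
(assuming that locale is installed), a Cyrillic letter is two bytes, so I
would expect bytes_alpha to reject it while wide_alpha accepts it.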
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
bug.sql:
set client_encoding='KOI8';
SELECT '�' ~* '[[:alpha:]]' as "true";
SELECT
'������' ~* '������' as "true",
'������' ~* '������' as "true";