BUG #3433: regexp \m and \M don't work for cyrillic - Mailing list pgsql-bugs

From Andriy Rysin
Subject BUG #3433: regexp \m and \M don't work for cyrillic
Date
Msg-id 200707070020.l670KOnp075994@wwwmaster.postgresql.org
Responses Re: BUG #3433: regexp \m and \M don't work for cyrillic  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
The following bug has been logged online:

Bug reference:      3433
Logged by:          Andriy Rysin
Email address:      arysin@gmail.com
PostgreSQL version: 8.2.4
Operating system:   Linux
Description:        regexp \m and \M don't work for cyrillic
Details:

psql krym
krym=> \encoding
UTF8
krym=> create table test (txt varchar);
CREATE TABLE
krym=> insert into test values ('latin');
INSERT 0 1
krym=> insert into test values ('кирилиця');
INSERT 0 1
krym=> select * from test;
   txt
----------
 latin
 кирилиця
(2 rows)

krym=> select * from test where txt ~* E'\\mla';
  txt
-------
 latin
(1 row)

krym=> select * from test where txt ~* E'\\mкир';
 txt
-----
(0 rows)

The regular-expression escapes \m and \M (beginning of word and end of
word) work for Latin characters but not for Cyrillic ones.
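
A possible way to narrow this down (a sketch, not part of the original report): the first three statements show the server encoding and locale settings, which is where the classification of non-ASCII characters as word characters would plausibly come from. That this is the cause is an assumption. The last query spells the word boundary out by hand as "start of string, whitespace, or punctuation" instead of using \m. It only approximates \m and has not been tested against 8.2.4.

```sql
-- Diagnostic: which encoding and locale settings is the server using?
SHOW server_encoding;
SHOW lc_ctype;
SHOW lc_collate;

-- Hypothetical workaround (assumption, not from the report): replace \m
-- with an explicit "start of string, whitespace, or punctuation" prefix.
-- It relies only on ASCII character classes, so it should not depend on
-- how Cyrillic letters are classified.
SELECT * FROM test WHERE txt ~* E'(^|[[:space:][:punct:]])кир';
```

If the workaround query returns the кирилиця row while the \m query does not, that would suggest the problem lies in how word characters are recognised for non-ASCII text rather than in the pattern itself.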
