regular expressions stranges - Mailing list pgsql-hackers

From Teodor Sigaev
Subject regular expressions stranges
Date
Msg-id 45B6054D.2060009@sigaev.ru
Responses Re: regular expressions stranges  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Regexps behave differently with non-ASCII characters depending on the server
encoding (bug.sql contains non-ASCII characters):

% initdb -E KOI8-R --locale ru_RU.KOI8-R
% psql postgres < bug.sql
  true
------
  t
(1 row)

  true | true
------+------
  t    | t
(1 row)
% initdb -E UTF8 --locale ru_RU.UTF-8
% psql postgres < bug.sql
  true
------
  f
(1 row)

  true | true
------+------
  f    | t
(1 row)

As far as I can see, this is because regc_locale.c uses isalpha (and the other
is* functions), tolower and toupper instead of the isw* and tow* functions. Is
there any reason to use them? If not, I can modify regc_locale.c similarly to
the locale part of tsearch2.
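
To illustrate the difference, here is a minimal sketch. It is not the actual
regc_locale.c code, and the function names count_alpha_bytes and
count_alpha_wide are hypothetical. It contrasts classifying a string one byte
at a time with isalpha against decoding it to wide characters first and then
using iswalpha. In UTF-8 a Cyrillic letter such as U+0436 is the two bytes
0xD0 0xB6, so the byte-wise loop never sees a whole character.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

#define MAX_WCHARS 64   /* arbitrary limit chosen for this sketch */

/* Byte-wise: every byte is handed to isalpha() on its own, so a
 * multibyte character is never classified as a whole. */
int count_alpha_bytes(const char *s)
{
    int n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++)
        if (isalpha((unsigned char) *s))
            n++;
    return n;
}

/* Character-wise: decode the string according to the current LC_CTYPE
 * locale, then classify each wide character with iswalpha().
 * Returns -1 if the string is too long for the sketch's buffer or
 * cannot be decoded in the current locale. */
int count_alpha_wide(const char *s)
{
    wchar_t buf[MAX_WCHARS];
    size_t  len, i;
    int     n = 0;

    assert(s != NULL);
    if (strlen(s) >= MAX_WCHARS)
        return -1;
    len = mbstowcs(buf, s, MAX_WCHARS);
    if (len == (size_t) -1)
        return -1;
    for (i = 0; i < len; i++)
        if (iswalpha((wint_t) buf[i]))
            n++;
    return n;
}
```

A caller would first select a UTF-8 locale, for example with
setlocale(LC_CTYPE, ""), and then compare the two counts for a string such as
"\xD0\xB6". What count_alpha_wide returns depends on the locales installed on
the system, so this is only meant to show the shape of the change.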



--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/
bug.sql:
set client_encoding='KOI8';

SELECT    '�' ~* '[[:alpha:]]' as "true";
SELECT
        '������' ~* '������' as "true",
        '������' ~* '������' as "true";
