Re: Regexps vs. locale - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Regexps vs. locale
Date
Msg-id 200901070444.n074iOM19932@momjian.us
In response to Regexps vs. locale  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
List pgsql-hackers
Added to TODO:
        Add ability to use case-insensitive regular expressions on
        multi-byte characters

        ILIKE already works with multi-byte characters

          * http://archives.postgresql.org/pgsql-hackers/2008-12/msg00433.php
 

---------------------------------------------------------------------------

Andrew Gierth wrote:
> This came up on irc:
> 
> postgres=# show lc_ctype;
>   lc_ctype   
> -------------
>  fr_FR.UTF-8
> 
> postgres=# show server_encoding;
>  server_encoding 
> -----------------
>  UTF8
> (1 row)
> 
> postgres=# select E'\303\201' ILIKE  E'\303\241';
>  ?column? 
> ----------
>  t
> (1 row)
> 
> postgres=# select E'\303\201' ~*  E'\303\241';
>  ?column? 
> ----------
>  f
> (1 row)
> 
> Obviously, this happens because the locale support functions in
> backend/regex/regc_locale.c are (presumably intentionally) crippled so
> as not to support non-ascii chars, despite all the code there using
> wide chars for everything otherwise.
> 
> Why is this? It does not appear to be a documented restriction.
> 
> -- 
> Andrew (irc:RhodiumToad)
> 

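For readers following the example above: the two escaped byte strings are the
UTF-8 encodings of U+00C1 (capital A with acute) and U+00E1 (small a with
acute). The sketch below is only an illustration in Python, not the code in
backend/regex/regc_locale.c; the helper names are hypothetical. It shows how a
case fold restricted to ASCII treats the two characters as different, while a
Unicode-aware fold treats them as equal, which mirrors the ~* versus ILIKE
results quoted above.

```python
# Illustration only: hypothetical helpers, not PostgreSQL's implementation.

def ascii_only_lower(ch: str) -> str:
    """Fold case for A-Z only; every other code point is returned unchanged."""
    return chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch


def ascii_ci_equal(a: str, b: str) -> bool:
    """Case-insensitive comparison that only knows about ASCII letters."""
    return [ascii_only_lower(c) for c in a] == [ascii_only_lower(c) for c in b]


def unicode_ci_equal(a: str, b: str) -> bool:
    """Case-insensitive comparison using Python's Unicode-aware lowercasing."""
    return a.lower() == b.lower()


# Same byte sequences as in the quoted psql session, decoded as UTF-8.
upper = b"\303\201".decode("utf-8")  # U+00C1
lower = b"\303\241".decode("utf-8")  # U+00E1

ascii_result = ascii_ci_equal(upper, lower)      # ASCII-only fold
unicode_result = unicode_ci_equal(upper, lower)  # Unicode-aware fold
```

Here ascii_result is False and unicode_result is True: the ASCII-only fold
never touches code points outside A-Z, so the accented pair can only match
when the folding step is aware of non-ASCII characters.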
-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +

