Re: regexp character class locale awareness patch - Mailing list pgsql-hackers

From Manuel Sugawara
Subject Re: regexp character class locale awareness patch
Date
Msg-id m38z7otj1t.fsf@dep4.fciencias.unam.mx
In response to Re: regexp character class locale awareness patch  (Peter Eisentraut <peter_e@gmx.net>)
Responses Re: regexp character class locale awareness patch  (Tatsuo Ishii <t-ishii@sra.co.jp>)
Re: regexp character class locale awareness patch  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers
According to POSIX regex(7), the standard character classes are:

    alnum    digit    punct
    alpha    graph    space
    blank    lower    upper
    cntrl    print    xdigit

Many of those classes differ between locales, and currently they all
behave as if the locale were C. Many of those tests have multibyte
issues; however, with the patch postgres will work for one-byte
encodings, which is better than nothing. If someone (Tatsuo?) gives
me some advice, I will work on the multibyte version.
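
To illustrate the idea for one-byte encodings, here is a minimal sketch
that derives class membership from the <ctype.h> predicates instead of
a hard-coded table. It is not the code from the patch: the names
cclass_entry, cclass_table, cclass_lookup and cclass_members are
hypothetical and exist only for this example.

```c
#include <ctype.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types and names, for illustration only. */
typedef int (*cclass_pred)(int);

struct cclass_entry
{
    const char *name;   /* class name as written inside [: :] */
    cclass_pred pred;   /* <ctype.h> predicate that decides membership */
};

static const struct cclass_entry cclass_table[] = {
    {"alnum", isalnum}, {"digit", isdigit}, {"punct", ispunct},
    {"alpha", isalpha}, {"graph", isgraph}, {"space", isspace},
    {"blank", isblank}, {"lower", islower}, {"upper", isupper},
    {"cntrl", iscntrl}, {"print", isprint}, {"xdigit", isxdigit}
};

/* Return the predicate for a class name, or NULL if the name is unknown. */
static cclass_pred
cclass_lookup(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(cclass_table) / sizeof(cclass_table[0]); i++)
    {
        if (strcmp(cclass_table[i].name, name) == 0)
            return cclass_table[i].pred;
    }
    return NULL;
}

/*
 * Store every one-byte character belonging to the class in out[], which
 * must have room for UCHAR_MAX + 1 entries, and return how many were
 * stored. The result depends on the LC_CTYPE locale in effect at the
 * time of the call.
 */
static size_t
cclass_members(const char *name, unsigned char *out)
{
    cclass_pred pred = cclass_lookup(name);
    size_t n = 0;
    int c;

    if (pred == NULL)
        return 0;
    for (c = 0; c <= UCHAR_MAX; c++)
    {
        if (pred(c))
            out[n++] = (unsigned char) c;
    }
    return n;
}
```

A program that has not called setlocale() runs in the C locale, where
cclass_members("alpha", buf) yields only the 52 ASCII letters. After
setlocale(LC_CTYPE, "") with a one-byte locale such as an ISO-8859-1
one, the same call would be expected to include the accented letters
as well, assuming that locale is installed on the system.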

Peter Eisentraut <peter_e@gmx.net> writes:
>
> Basically, you manually preprocess the patch to include the
> USE_LOCALE branch and remove the not USE_LOCALE branch.

Yeah, that should work. You may also remove include/regex/cclass.h
since it will not be used any more.

> However, if the no-locale branches have significant performance
> benefits then it might be worth pondering setting up some
> optimizations.

This is not the case.

Regards,
Manuel.

