Re: regexp character class locale awareness patch - Mailing list pgsql-hackers

From	Tatsuo Ishii
Subject	Re: regexp character class locale awareness patch
Date	April 15, 2002 22:44:01
Msg-id	20020416114247T.t-ishii@sra.co.jp
In response to	Re: regexp character class locale awareness patch (Manuel Sugawara <masm@fciencias.unam.mx>)
Responses	Re: regexp character class locale awareness patch
List	pgsql-hackers

> According to POSIX regex(7), the standard character classes are:
> 
>               alnum       digit       punct
>               alpha       graph       space
>               blank       lower       upper
>               cntrl       print       xdigit
> 
> Many of those classes differ between locales, but currently they
> all behave as if the locale were C. Many of those tests have
> multibyte issues; however, with the patch postgres will work for
> one-byte encodings, which is better than nothing. If someone
> (Tatsuo?) gives some advice I will work on the multibyte version.

I don't think character classes are applicable to most multibyte
encodings. Maybe the only exception is Unicode?
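
For the one-byte case discussed above, the locale dependence can be
illustrated with a minimal sketch that derives class membership from
the standard <ctype.h> predicates under the current LC_CTYPE setting.
The helper names lookup_pred and cclass_members are hypothetical and
are not taken from the patch; the sketch handles single-byte values
0..255 only and says nothing about multibyte encodings.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef int (*ctype_pred)(int);

/* Hypothetical helper: map a POSIX class name to its <ctype.h> test. */
static ctype_pred lookup_pred(const char *name)
{
    static const struct { const char *name; ctype_pred pred; } classes[] = {
        {"alnum", isalnum}, {"alpha", isalpha}, {"blank", isblank},
        {"cntrl", iscntrl}, {"digit", isdigit}, {"graph", isgraph},
        {"lower", islower}, {"print", isprint}, {"punct", ispunct},
        {"space", isspace}, {"upper", isupper}, {"xdigit", isxdigit},
    };
    for (size_t i = 0; i < sizeof classes / sizeof classes[0]; i++)
        if (strcmp(classes[i].name, name) == 0)
            return classes[i].pred;
    return NULL;
}

/*
 * Hypothetical helper: set member[c] to 1 for every byte value c that the
 * current LC_CTYPE locale places in class `name`, and to 0 otherwise.
 * Returns the number of members, or -1 if the class name is unknown.
 */
int cclass_members(const char *name, unsigned char member[256])
{
    assert(name != NULL && member != NULL);

    ctype_pred pred = lookup_pred(name);
    if (pred == NULL)
        return -1;

    int count = 0;
    for (int c = 0; c < 256; c++) {
        member[c] = pred(c) != 0;   /* argument is in unsigned char range */
        count += member[c];
    }
    return count;
}
```

The table reflects whatever locale is active when cclass_members() is
called, so calling setlocale(LC_CTYPE, ...) with a single-byte locale
beforehand may add accented letters to classes such as alpha; which
locales are installed depends on the system.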

> Peter Eisentraut <peter_e@gmx.net> writes:
> >
> > Basically, you manually preprocess the patch to include the
> > USE_LOCALE branch and remove the not USE_LOCALE branch.
> 
> Yeah, that should work. You may also remove include/regex/cclass.h
> since it will not be used any more.

But I don't like that the cclass_init() routine runs every time
reg_comp is called. In my understanding the result of cclass_init() is
always the same. What about running cclass_init() in postmaster, not
postgres? Or even better, at initdb time?
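
As a minimal sketch of the concern above: if the result really is
always the same, the work can be guarded so that it runs only once. The
code below uses a plain per-process flag, which is a simpler technique
than moving the call into postmaster or initdb as suggested. Apart from
the name cclass_init, everything here (regcomp_sketch, alpha_table, the
counters) is a hypothetical stand-in, not the actual PostgreSQL code.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state: a cached class table plus a "built" flag. */
static bool cclass_ready = false;
static int cclass_init_calls = 0;          /* for illustration only */
static unsigned char alpha_table[256];

/* Stand-in for the expensive initialization; builds one class table. */
static void cclass_init(void)
{
    for (int c = 0; c < 256; c++)
        alpha_table[c] = isalpha(c) != 0;
    cclass_init_calls++;
}

/* Stand-in for the regex compile entry point (assumed shape). */
int regcomp_sketch(const char *pattern)
{
    assert(pattern != NULL);

    /* Build the tables on first use only; later calls skip this. */
    if (!cclass_ready) {
        cclass_init();
        cclass_ready = true;
    }

    /* The real pattern compilation would go here. */
    return 0;
}
```

With this guard, each process still pays the cost once. Running the
initialization in a parent process before it forks would let children
inherit the tables, but that only holds if every child uses the same
locale as the parent, which is an assumption this sketch does not check.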
--
Tatsuo Ishii
