Thread: Possible locale issue with 7.4

Possible locale issue with 7.4

From
Bruno Wolff III
Date:
In 7.4 I am finding that '(' (and some other punctuation) is not a member of
[:print:]. It is in 7.3.  It is a member of [:graph:] in 7.4 (which is
supposed to be [:print:] - [:space:]).

The following is my 7.4 config:
./configure --prefix=/usr/local/pgsql --enable-integer-datetimes --with-pgport=5433

For 7.3 I used:
./configure --prefix=/usr/lib/pgsql --exec-prefix=/usr --with-perl --with-openssl --mandir=/usr/man --docdir=/usr/doc --enable-integer-datetimes

The following is an example of the problem:
area=> select version();
                                version
------------------------------------------------------------------------
 PostgreSQL 7.4beta3 on i686-pc-linux-gnu, compiled by GCC egcs-2.91.66
(1 row)

area=> select '(' ~ '[[:print:]]';
 ?column?
----------
 f
(1 row)

area=> select '(' ~ '[[:graph:]]';
 ?column?
----------
 t
(1 row)

area=> select '0' ~ '[[:print:]]';
 ?column?
----------
 t
(1 row)
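
For comparison, the same three checks can be run against the C library's POSIX regex functions. This is only a sketch: it exercises the system's regcomp()/regexec(), not the regex engine inside PostgreSQL, and posix_matches is a hypothetical helper name. In the C locale, '(' would be expected to match both [[:print:]] and [[:graph:]].

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not PostgreSQL code): returns 1 if text matches the
 * POSIX extended regular expression in pattern, 0 if it does not, and -1 if
 * the pattern fails to compile.
 */
static int
posix_matches(const char *pattern, const char *text)
{
    regex_t     re;
    int         rc;

    assert(pattern != NULL && text != NULL);

    /* REG_NOSUB: only a yes/no answer is needed, no match offsets */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);

    return (rc == 0) ? 1 : 0;
}
```

Calling posix_matches("[[:print:]]", "(") mirrors the first query above; swapping in "[[:graph:]]" or the text "0" mirrors the other two.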


Re: Possible locale issue with 7.4

From
Tom Lane
Date:
Bruno Wolff III <bruno@wolff.to> writes:
> In 7.4 I am finding that '(' (and some other punctuation) is not a member of
> [:print:]. It is in 7.3.  It is a member of [:graph:] in 7.4 (which is
> supposed to be [:print:] - [:space:]).

This is not a locale problem, because I see it in C locale too.
[digs]  Apparently this is an oversight in the new regex code we 
lifted from Tcl 8.4.1:
    switch ((enum classes) index)
    {
        case CC_PRINT:
        case CC_ALNUM:
            cv = getcvec(v, UCHAR_MAX, 1, 0);
            if (cv)
            {
                for (i = 0; i <= UCHAR_MAX; i++)
                {
                    if (pg_isalpha((chr) i))
                        addchr(cv, (chr) i);
                }
                addrange(cv, (chr) '0', (chr) '9');
            }
            break;

in other words, :print: is the same as :alnum:.  This is obviously
a bug, will fix ... wonder if Henry Spencer knows about it?
        regards, tom lane
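
To make the difference concrete, here is a small standalone sketch that builds the character set both ways using only the standard <ctype.h> functions. It is not the PostgreSQL or Tcl code and not the eventual fix; fill_print_buggy and fill_print_expected are hypothetical names, and isalpha()/isprint() stand in for the pg_* wrappers used in the excerpt above.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical illustration: mark members of the class the way the shared
 * CC_PRINT/CC_ALNUM case does, i.e. letters plus the digits '0'..'9'.
 * Returns the number of members.
 */
static size_t
fill_print_buggy(unsigned char set[UCHAR_MAX + 1])
{
    size_t      n = 0;
    int         i;

    assert(set != NULL);
    for (i = 0; i <= UCHAR_MAX; i++)
    {
        set[i] = (isalpha(i) || (i >= '0' && i <= '9')) ? 1 : 0;
        n += set[i];
    }
    return n;
}

/*
 * Hypothetical illustration: mark members the way one would expect for
 * [:print:], using the C library's isprint().
 * Returns the number of members.
 */
static size_t
fill_print_expected(unsigned char set[UCHAR_MAX + 1])
{
    size_t      n = 0;
    int         i;

    assert(set != NULL);
    for (i = 0; i <= UCHAR_MAX; i++)
    {
        set[i] = isprint(i) ? 1 : 0;
        n += set[i];
    }
    return n;
}
```

In the C locale, the first set leaves out '(' and the space character while the second includes both, which matches the behaviour reported at the top of the thread.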


Re: Possible locale issue with 7.4

From
Alvaro Herrera
Date:
On Sun, Sep 28, 2003 at 08:09:31PM -0400, Tom Lane wrote:
> Bruno Wolff III <bruno@wolff.to> writes:
> > In 7.4 I am finding that '(' (and some other punctuation) is not a member of
> > [:print:]. It is in 7.3.  It is a member of [:graph:] in 7.4 (which is
> > supposed to be [:print:] - [:space:]).
> 
> This is not a locale problem, because I see it in C locale too.
> [digs]  Apparently this is an oversight in the new regex code we 
> lifted from Tcl 8.4.1:

Here
http://cvs.sourceforge.net/viewcvs.py/tcl/tcl/generic/regc_locale.c?rev=1.10&view=auto

is the Tcl version.  It looks very similar (meaning, :print: is the
same as :alnum:).  Note that the code hasn't changed since
Mon Jul 29 12:27:51 2002 UTC, but it is marked with tags up to version 8.4.4.

Maybe not too many people use :print:?

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"A wizard is never late, Frodo Baggins, nor is he early.
He arrives precisely when he means to."  (Gandalf, en LoTR FoTR)


Re: Possible locale issue with 7.4

From
Bruno Wolff III
Date:
On Sun, Sep 28, 2003 at 20:09:31 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
> in other words, :print: is the same as :alnum:.  This is obviously
> a bug, will fix ... wonder if Henry Spencer knows about it?

The really cute thing is I only found it because I made a mistake.
I didn't want to include spaces in what I was using it for and really
should have been using [:graph:] instead.