Thread: Possible locale issue with 7.4
In 7.4 I am finding that '(' (and some other punctuation) is not a member of
[:print:]. It is in 7.3. It is a member of [:graph:] in 7.4 (which is
supposed to be [:print:] - [:space:]).

The following is my 7.4 config:
./configure --prefix=/usr/local/pgsql --enable-integer-datetimes --with-pgport=5433

For 7.3 I used:
./configure --prefix=/usr/lib/pgsql --exec-prefix=/usr --with-perl --with-openssl --mandir=/usr/man --docdir=/usr/doc --enable-integer-datetimes

The following is an example of the problem:

area=> select version();
                                version
------------------------------------------------------------------------
 PostgreSQL 7.4beta3 on i686-pc-linux-gnu, compiled by GCC egcs-2.91.66
(1 row)

area=> select '(' ~ '[[:print:]]';
 ?column?
----------
 f
(1 row)

area=> select '(' ~ '[[:graph:]]';
 ?column?
----------
 t
(1 row)

area=> select '0' ~ '[[:print:]]';
 ?column?
----------
 t
(1 row)
Bruno Wolff III <bruno@wolff.to> writes:
> In 7.4 I am finding that '(' (and some other punctuation) is not a member of
> [:print:]. It is in 7.3. It is a member of [:graph:] in 7.4 (which is
> supposed to be [:print:] - [:space:]).

This is not a locale problem, because I see it in C locale too.
[digs] Apparently this is an oversight in the new regex code we
lifted from Tcl 8.4.1:

    switch ((enum classes) index)
    {
        case CC_PRINT:
        case CC_ALNUM:
            cv = getcvec(v, UCHAR_MAX, 1, 0);
            if (cv)
            {
                for (i = 0; i <= UCHAR_MAX; i++)
                {
                    if (pg_isalpha((chr)i))
                        addchr(cv, (chr) i);
                }
                addrange(cv, (chr) '0', (chr)'9');
            }
            break;

in other words, :print: is the same as :alnum:. This is obviously
a bug, will fix ... wonder if Henry Spencer knows about it?

regards, tom lane
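To make the size of the gap concrete, here is a minimal sketch that compares the membership the quoted case builds (letters plus '0'..'9') with what a caller would expect from [:print:]. It is not the regex code itself and not the fix that was applied: the function names are hypothetical, the standard isalpha() stands in for pg_isalpha(), and isprint() in the default C locale is assumed to be the intended definition of the class.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdbool.h>

/* Membership as the quoted case builds it: every letter, then '0'..'9'.
 * Assumption: isalpha() stands in for pg_isalpha(). */
bool in_print_as_quoted(int c)
{
    return isalpha(c) != 0 || (c >= '0' && c <= '9');
}

/* Membership a caller would expect from [:print:].
 * Assumption: isprint() is the intended definition. */
bool in_print_expected(int c)
{
    return isprint(c) != 0;
}

/* Store every character that is printable but absent from the quoted
 * class. `out` must have room for UCHAR_MAX + 1 entries.
 * Returns the number of characters stored. */
int missing_from_print(unsigned char *out)
{
    int n = 0;

    assert(out);
    for (int c = 0; c <= UCHAR_MAX; c++)
    {
        if (in_print_expected(c) && !in_print_as_quoted(c))
            out[n++] = (unsigned char) c;
    }
    return n;
}
```

Called from a program that has not changed its locale, missing_from_print() fills the buffer with the space character and the ASCII punctuation, which matches the report that '(' fails while '0' passes.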
On Sun, Sep 28, 2003 at 08:09:31PM -0400, Tom Lane wrote:
> Bruno Wolff III <bruno@wolff.to> writes:
> > In 7.4 I am finding that '(' (and some other punctuation) is not a member of
> > [:print:]. It is in 7.3. It is a member of [:graph:] in 7.4 (which is
> > supposed to be [:print:] - [:space:]).
>
> This is not a locale problem, because I see it in C locale too.
> [digs] Apparently this is an oversight in the new regex code we
> lifted from Tcl 8.4.1:

Here
http://cvs.sourceforge.net/viewcvs.py/tcl/tcl/generic/regc_locale.c?rev=1.10&view=auto
is the Tcl version. It looks very similar (meaning :print: is the same as
:alnum:). Note that the code hasn't changed since Mon Jul 29 12:27:51 2002 UTC
but is marked with tags up to version 8.4.4.

Maybe not many people use :print:?

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"A wizard is never late, Frodo Baggins, nor is he early.
He arrives precisely when he means to." (Gandalf, in LoTR FoTR)
On Sun, Sep 28, 2003 at 20:09:31 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> in other words, :print: is the same as :alnum:. This is obviously
> a bug, will fix ... wonder if Henry Spencer knows about it?

The really cute thing is I only found it because I made a mistake. I didn't
want to include spaces in what I was using it for and really should have
been using [:graph:] instead.
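The distinction drawn here between [:print:] and [:graph:] can be checked against the C library's own classification. The sketch below is illustrative only: the function names are hypothetical, it uses the standard <ctype.h> predicates rather than anything from the regex code, and it assumes the default C locale, where the relation "[:graph:] is [:print:] - [:space:]" is expected to hold.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdbool.h>

/* True when isgraph() agrees with "printable and not a space character"
 * for every unsigned char value in the current locale. */
bool graph_is_print_minus_space(void)
{
    for (int c = 0; c <= UCHAR_MAX; c++)
    {
        bool expected = isprint(c) && !isspace(c);

        if ((isgraph(c) != 0) != expected)
            return false;
    }
    return true;
}

/* True when every character of s belongs to the class tested by
 * is_member (for example isprint or isgraph). */
bool all_in_class(const char *s, int (*is_member)(int))
{
    assert(s && is_member);
    for (; *s; s++)
    {
        /* Cast first: passing a negative char to a ctype function
         * is undefined behaviour. */
        if (!is_member((unsigned char) *s))
            return false;
    }
    return true;
}
```

With these helpers, a string such as "a (b)" passes all_in_class() with isprint but fails with isgraph because of the embedded space, which is the behaviour wanted when spaces must be excluded.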