Re: postgresql v7.1.3 bug report - Mailing list pgsql-bugs

From Tatsuo Ishii
Subject Re: postgresql v7.1.3 bug report
Date
Msg-id 20010905121700S.t-ishii@sra.co.jp
In response to Re: postgresql v7.1.3 bug report  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: postgresql v7.1.3 bug report  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
> "pierre" <cti848@www.textilenet.org.tw> writes:
> >     I make postgres 7.1.3 version in my linux system with
> > --enable-multibyte=EUC_TW, but
>
> >     I got some problem when I exec sql command below,  in chinese character
> >  (CName ~* '=A6|'')  the chicode is 0xA67C  -> 0x7c is ascii '|" , I guess
> > you system reject '|' this byte, but it was Big5 Code 2nd byte , How can I
> > avoid this proble??
>
> > SELECT * FROM ifabinstn Where((CName ~* '=A6|') OR FALSE) ORDER BY CName
>
> > Warning: PostgreSQL query failed: ERROR: Invalid regular expression: empty
> > expression or subexpression in DB/pgsql.php on line 163
> > ERROR: Invalid regular expression: empty expression or subexpression
>
>
> I am thinking that p_ere's local "char c" (regcomp.c, about line 304 in
> current sources) should have been declared "pg_wchar c".  Tatsuo, what
> do you think?  Are there any other places in this file where char should
> be pg_wchar?

I don't think so. The problem is that he uses EUC_TW as the backend
encoding but Big5 as the frontend encoding. In this case he should
declare the client-side encoding explicitly so that the backend does the
encoding conversion. To accomplish this in PHP scripts, call:

pg_set_client_encoding($con, "BIG5");

before doing any query ($con is a connection to PostgreSQL).

Note that EUC_TW, like every multibyte encoding allowed on the
backend side, never uses an ASCII special character such as "|" (0x7c)
as a byte of a multibyte character, so it should be safe for the parser
and the regexp routines.
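
To illustrate the byte-level point, here is a small Python sketch (not
part of the original thread). It takes the byte pair 0xA6 0x7C quoted in
the report, decodes it as Big5, and lists which of its bytes fall in the
7-bit ASCII range. The helper name ascii_bytes is hypothetical. Python
ships no EUC_TW codec, so only the Big5 side is checked here; the
claim about EUC_TW rests on the note above.

```python
# Sketch only: inspects the Big5 byte pair from the report (0xA6 0x7C).
RAW = b"\xa6\x7c"  # byte pair quoted in the original report


def ascii_bytes(raw):
    """Return the bytes of raw that fall in the 7-bit ASCII range."""
    return [b for b in raw if b < 0x80]


ch = RAW.decode("big5")   # a single Big5 character made of two bytes
trail = ascii_bytes(RAW)  # bytes a byte-oriented regexp would read as ASCII

print(len(ch), [hex(b) for b in trail], chr(trail[0]))
```

If such bytes reach the backend unconverted, the regexp routines see a
bare "|" and report an empty subexpression, which matches the error in
the report.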
--
Tatsuo Ishii
