Thread: [Fwd: [GENERAL] [Please Help!!!!!!!!] Problem in Chinese (Big5)!!! Version 7.2.1 (come with Redhat 7.3)]

-------- Original Message --------
Subject: [GENERAL] [Please Help!!!!!!!!] Problem in Chinese (Big5)!!!
Version 7.2.1 (come with Redhat 7.3)
Date: Tue, 25 Jun 2002 11:03:17 +0800
From: Gordon Luk <gordon@gforce.ods.org>
To: pgsql-general@postgresql.org



Hi all,

I am running Red Hat 7.3 and installed PostgreSQL 7.2.1 from the Red Hat CD.
I tried to create a new database encoded with EUC_TW, which should
support Chinese (Big5). Then I used Pgadmin II to input the Chinese
characters "中文字", and it was rejected like the following:

ERROR : Invalid EUC_TW character sequence found (0xa672)....

When I input "中文", it is fine, so I know the problem is the Chinese
character "字". But that character is just normal, like "A", "B", "C"
in English, not a special character in Chinese.... I have tried many
more Chinese words with different encodings, like Unicode, EUC_CN and
more, and they are also rejected: "invalid.... character sequence...".


Has anyone experienced this case? How can I solve the problem? Please help,
thanks.

Gordon

PS: In version 7.1.3 it worked fine with EUC_TW; now I still cannot
restore to 7.2.1... :-(




_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com




---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Re: [Fwd: [GENERAL] [Please Help!!!!!!!!] Problem in

From
Tatsuo Ishii
Date:
> I am running Red Hat 7.3 and installed PostgreSQL 7.2.1 from the Red Hat CD.
> I tried to create a new database encoded with EUC_TW, which should
> support Chinese (Big5). Then I used Pgadmin II to input the Chinese
> characters "中文字", and it was rejected like the following:
>
> ERROR : Invalid EUC_TW character sequence found (0xa672)....
>
> When I input "中文", it is fine, so I know the problem is the Chinese
> character "字". But that character is just normal, like "A", "B", "C"
> in English, not a special character in Chinese.... I have tried many
> more Chinese words with different encodings, like Unicode, EUC_CN and
> more, and they are also rejected: "invalid.... character sequence...".
>
>
> Has anyone experienced this case? How can I solve the problem? Please help,
> thanks.

Honestly I'm tired of this kind of complaint. Please verify your
"correct" EUC_TW character sequences first. "中文字"
cannot be correct EUC_TW at all. I have already shown
Gene Leung "rules to verify your EUC_TW character sequences".
See below.

BTW, I have no idea what Pgadmin II is. Are you sure that it supports
EUC_TW? I suspect it only supports Big5. (EUC_TW and Big5 are
completely different beasts).

---------------------------------------------------------------
Ok, here are some rules to verify EUC_TW characters:

(1) if the first byte is 0x8e, then the 8th bit of following three
    bytes must be set

(2) else if the first byte is 0x8f, then the 8th bit of following two
    bytes must be set

(3) else if the 8th bit of the first byte is set, then the 8th bit of
    the following byte must be set

(4) else (that means the 8th bit of the first byte is not set) then
    that must be an ASCII character.

Apparently 0xa672 does not satisfy the above: 0xa6 has its 8th bit
set, but the following byte 0x72 does not, which violates rule (3).
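
The four rules above can be sketched as a small checker. This is only an
illustration of the rules exactly as stated in this message, not
PostgreSQL's actual validation code; the function name
first_invalid_euc_tw is made up for the example.

```python
def first_invalid_euc_tw(data: bytes):
    """Return the offset of the first sequence breaking the rules, or None."""
    i = 0
    while i < len(data):
        lead = data[i]
        if lead == 0x8E:
            trailing = 3          # rule (1): three following bytes
        elif lead == 0x8F:
            trailing = 2          # rule (2): two following bytes
        elif lead & 0x80:
            trailing = 1          # rule (3): one following byte
        else:
            i += 1                # rule (4): plain ASCII
            continue
        tail = data[i + 1 : i + 1 + trailing]
        # Every following byte must exist and have its 8th bit set.
        if len(tail) < trailing or any((b & 0x80) == 0 for b in tail):
            return i
        i += 1 + trailing
    return None


# The byte pair from the error message fails rule (3) at offset 0.
offset = first_invalid_euc_tw(b"\xa6\x72")
```

A return value of None means every sequence passed these four checks; it
does not prove the bytes are meaningful EUC_TW text.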

--
Tatsuo Ishii

Re: [Fwd: [GENERAL] [Please Help!!!!!!!!] Problem in

From
Tatsuo Ishii
Date:
> Ok, but the problem is that when I try encoding with Unicode, it is also
> rejected.... invalid UNICODE character.... :-(

Show me the entire error message please.

> I have already tried a few clients, like Borland's SQL Explorer, zde... and
> the restore program that comes with PostgreSQL...
>
> Sorry, I would like to make a special request. After reading your previous
> message to Gene Leung, I fully understand that under the EUC_TW rules
> PostgreSQL should reject my input (such invalid characters). So I request
> a special patch that could support Big5 or disable the validation.
>
> If PostgreSQL does not support Big5, that is a big problem for Chinese...
> Please help.

Actually PostgreSQL does support Big5. To use Big5, set the client
encoding to Big5 and set the server(DB) encoding to EUC_TW. PostgreSQL
will take care of the conversion between Big5 and EUC_TW.

There are several ways to set the client encoding to Big5:

SQL: set client_encoding to 'Big5';
from psql: \encoding Big5
using environment variable: export PGCLIENTENCODING=Big5 (example for bash)
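
As a rough sketch of the first option, a script can be prefixed with the
SET statement before it is sent, so that the Big5 bytes that follow are
converted by the server. The helper name build_big5_script and the table
and column names are hypothetical, chosen only for this illustration.

```python
def build_big5_script(statements):
    """Prefix the statements with SET client_encoding and encode as Big5."""
    lines = ["SET client_encoding TO 'Big5';"]
    lines.extend(statements)
    return ("\n".join(lines) + "\n").encode("big5")


# Hypothetical table "notes" with a text column "body".
script = build_big5_script(["INSERT INTO notes (body) VALUES ('中文字');"])
```

The resulting bytes could be written to a file and run with psql -f
against a database created with the EUC_TW encoding.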

Hope this helps,
--
Tatsuo Ishii

Re: [Fwd: [GENERAL] [Please Help!!!!!!!!] Problem in Chinese

From
Gordon Luk
Date:
Hi Tatsuo Ishii,

Ok, but the problem is that when I try encoding with Unicode, it is also
rejected.... invalid UNICODE character.... :-(
I have already tried a few clients, like Borland's SQL Explorer, zde... and
the restore program that comes with PostgreSQL...

Sorry, I would like to make a special request. After reading your previous
message to Gene Leung, I fully understand that under the EUC_TW rules
PostgreSQL should reject my input (such invalid characters). So I request
a special patch that could support Big5 or disable the validation.

If PostgreSQL does not support Big5, that is a big problem for Chinese...
Please help.

Thanks for your quick response (I have had no response on the General mailing
list :-( , maybe no one is using PostgreSQL with Chinese [Big5].)

Gordon

Tatsuo Ishii wrote:

>Honestly I'm tired of this kind of complaint. Please verify your
>"correct" EUC_TW character sequences first. "中文字"
>cannot be correct EUC_TW at all. I have already shown
>Gene Leung "rules to verify your EUC_TW character sequences".
>See below.
>
>BTW, I have no idea what Pgadmin II is. Are you sure that it supports
>EUC_TW? I suspect it only supports Big5. (EUC_TW and Big5 are
>completely different beasts).
>
>---------------------------------------------------------------
>Ok, here are some rules to verify EUC_TW characters:
>
>(1) if the first byte is 0x8e, then the 8th bit of following three
>    bytes must be set
>
>(2) else if the first byte is 0x8f, then the 8th bit of following two
>    bytes must be set
>
>(3) else if the 8th bit of the first byte is set, then the 8th bit of
>    the following byte must be set
>
>(4) else (that means the 8th bit of the first byte is not set) then
>    that must be an ASCII character.
>
>Apparently 0xa672 does not satisfy the above: 0xa6 has its 8th bit
>set, but the following byte 0x72 does not, which violates rule (3).
>
>--
>Tatsuo Ishii
>
>





Re: [Fwd: [GENERAL] [Please Help!!!!!!!!] Problem in Chinese

From
Gordon Luk
Date:
Tatsuo Ishii wrote:

>>Ok, but the problem is that when I try encoding with Unicode, it is also
>>rejected.... invalid UNICODE character.... :-(
>>
>>
>
>Show me the entire error message please.
>

Ok... the error is like this...
ERROR : Invalid UNICODE character sequence found (0xe5a672)...

The input characters were also "中文字"....
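
A minimal sketch of why both error messages end in the same bytes,
assuming the client sent the three characters 中文字 as raw Big5 bytes
(an assumption based on the 0xa672 in the first error; the variable names
are illustrative). It uses only Python's standard codecs and says nothing
about PostgreSQL internals.

```python
# Assumption: the client sends the text as raw Big5 bytes.
text = "中文字"
big5_bytes = text.encode("big5")
print(big5_bytes.hex())  # the last two bytes are a6 72

# Read as Latin-1, the same bytes display as the garbled form
# that a non-Big5 mail reader or terminal would show.
mojibake = big5_bytes.decode("latin-1")

# The same bytes are not valid UTF-8 either, so a Unicode database
# has no more reason to accept them than an EUC_TW one.
try:
    big5_bytes.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
```

In other words, changing only the database encoding does not help while
the client keeps sending Big5 bytes without declaring them as Big5.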

>Actually PostgreSQL does support Big5. To use Big5, set the client
>encoding to Big5 and set the server(DB) encoding to EUC_TW. PostgreSQL
>will take care of the conversion between Big5 and EUC_TW.
>
>There are several ways to set the client encoding to Big5:
>
>SQL: set client_encoding to 'Big5';
>from psql: \encoding Big5
>using environment variable: export PGCLIENTENCODING=Big5 (example for bash)
>
>Hope this helps,
>--
>Tatsuo Ishii
>
>
Oh... you are right. I used Pgadmin II, typed the SQL by hand, and added
"set client_encoding to 'Big5';" before the insert statement.... It
WORKS!!!! Thanks...


Gordon