Re: [Fwd: [GENERAL] [Please Help!!!!!!!!] Problem in - Mailing list pgsql-bugs

From Tatsuo Ishii
Subject Re: [Fwd: [GENERAL] [Please Help!!!!!!!!] Problem in
Msg-id 20020627.000015.85413275.t-ishii@sra.co.jp
In response to [Fwd: [GENERAL] [Please Help!!!!!!!!] Problem in Chinese (Big5)!!! Version 7.2.1 (come with Redhat 7.3)]  (Gordon Luk <gordon@gforce.ods.org>)
List pgsql-bugs
> I am runing redhat 7.3, and install the postgresql 7.2.1 from Redhat CD.
> I try to create a new database encode with EUC_TW... it should be
> support chinese (Big5). And then i use Pgadmin II to input chinese
> character "中文字" , it reject me... like following :
>
> ERROR : Invalid EUC_TW character sequence found (0xa672)....
>
> when i input "中文" , it fine... i know the problem in the chinese
> character "字"... but the character just normal ... just like in english
> "A", "B", "C", not a special character in chinese.... i have try more
> more chinese word with different encode.. like unicode, euc_cn..and
> more... also reject me... "invalid.... character sequece...".
>
>
> Anyone experience about case.... how to solve the problem ? Please help,
> thanks.

Honestly, I'm tired of this kind of complaint. Please verify your
"correct" EUC_TW character sequences first. The "中文字" you sent
cannot be valid EUC_TW at all. I have already shown
Gene Leung "rules to verify your EUC_TW character sequences".
See below.

BTW, I have no idea what Pgadmin II is. Are you sure that it supports
EUC_TW? I suspect it only supports Big5. (EUC_TW and Big5 are
completely different beasts).

---------------------------------------------------------------
Ok, here are some rules to verify EUC_TW characters:

(1) if the first byte is 0x8e, then the 8th bit of following three
    bytes must be set

(2) else if the first byte is 0x8f, then the 8th bit of following two
    bytes must be set

(3) else if the 8th bit of the first byte is set, then the 8th bit of
    the following one byte must be set

(4) else (that means the 8th bit of the first byte is not set) then
    that must be an ASCII character.

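Clearly, 0xa672 does not satisfy these rules: its first byte 0xa6 has
the 8th bit set, so rule (3) applies, but the second byte 0x72 does
not have its 8th bit set.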
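
The four rules can be sketched as a small checker. This is only an
illustration of the rules as stated in this message, not PostgreSQL's
actual code; the function name is_valid_euc_tw is made up for the
example, and it checks only the 8th-bit conditions, not whether each
code point is actually assigned. The last line assumes the rejected
text was really Big5, which is a guess based on the reported bytes.

```python
def is_valid_euc_tw(data: bytes) -> bool:
    """Check a byte string against the four 8th-bit rules listed above."""
    i = 0
    while i < len(data):
        first = data[i]
        if first == 0x8E:        # rule (1): three trailing bytes
            trailing = 3
        elif first == 0x8F:      # rule (2): two trailing bytes
            trailing = 2
        elif first & 0x80:       # rule (3): one trailing byte
            trailing = 1
        else:                    # rule (4): plain ASCII, no trailing bytes
            trailing = 0
        tail = data[i + 1 : i + 1 + trailing]
        # A truncated sequence, or a trailing byte without the 8th bit, fails.
        if len(tail) != trailing or any((b & 0x80) == 0 for b in tail):
            return False
        i += 1 + trailing
    return True


print(is_valid_euc_tw(b"\xa6\x72"))  # the sequence from the error: False
print(is_valid_euc_tw(b"ABC"))       # plain ASCII: True

# Assumption: the client sent Big5. Decoding the same bytes as Big5
# shows what the user probably typed.
print(b"\xa4\xa4\xa4\xe5\xa6\x72".decode("big5"))
```

If the bytes decode cleanly as Big5 but fail this check, the client is
most likely sending Big5 to a database that expects EUC_TW.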

--
Tatsuo Ishii
