Re: Invalid EUC_TW character sequence found - Mailing list pgsql-bugs

From Tatsuo Ishii
Subject Re: Invalid EUC_TW character sequence found
Msg-id 20020626.124206.102120976.t-ishii@sra.co.jp
In response to Re: Invalid EUC_TW character sequence found  (Tatsuo Ishii <t-ishii@sra.co.jp>)
List pgsql-bugs
> To me, the third insert is a character that displays correctly in my application,
> and I do not see any problem.  I also do not know how to check whether
> 'xx' is a correct EUC_TW character.  Please give me some hints for checking,
> thanks!!

Ok, here are some rules to verify EUC_TW characters:

(1) if the first byte is 0x8e, then the 8th bit of each of the
    following three bytes must be set

(2) else if the first byte is 0x8f, then the 8th bit of each of the
    following two bytes must be set

(3) else if the 8th bit of the first byte is set, then the 8th bit of
    the following byte must be set

(4) else (meaning the 8th bit of the first byte is not set) the byte
    must be an ASCII character.

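Apparently 0xa672 does not satisfy the above: the first byte 0xa6 has
its 8th bit set, so rule (3) applies, but the second byte 0x72 does
not have its 8th bit set.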
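
For checking data by hand, the four rules can be sketched as a small
Python function. This is only an illustration of the rules as stated in
this message, not the code the server runs; the names euc_tw_char_len
and find_invalid_euc_tw are made up for the example, and it checks only
the 8th-bit conditions, not whether each byte falls in an assigned
range.

```python
def euc_tw_char_len(first: int) -> int:
    """Length in bytes of a character, judged from its first byte.

    Hypothetical helper: follows rules (1)-(4) from this message only.
    """
    if first == 0x8E:   # rule (1): three more bytes follow
        return 4
    if first == 0x8F:   # rule (2): two more bytes follow
        return 3
    if first & 0x80:    # rule (3): one more byte follows
        return 2
    return 1            # rule (4): plain ASCII


def find_invalid_euc_tw(data: bytes) -> int:
    """Return the offset of the first sequence that breaks the rules,
    or -1 if the whole byte string passes."""
    i = 0
    while i < len(data):
        n = euc_tw_char_len(data[i])
        trail = data[i + 1 : i + n]
        # A truncated character, or a following byte whose 8th bit is
        # clear, violates the rules.
        if len(trail) != n - 1 or any((b & 0x80) == 0 for b in trail):
            return i
        i += n
    return -1


if __name__ == "__main__":
    # The sequence from this thread: 0xa6 followed by 0x72.
    print(find_invalid_euc_tw(b"\xa6\x72"))
```

Running it on the bytes 0xa6 0x72 reports offset 0, because the second
byte fails rule (3).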
--
Tatsuo Ishii
