BUG #1091: Localization in EUC_TW Can't decode Big5 0xFA40--0xFEF0. - Mailing list pgsql-bugs

From PostgreSQL Bugs List
Subject BUG #1091: Localization in EUC_TW Can't decode Big5 0xFA40--0xFEF0.
Date
Msg-id 20040304020847.E10A2CF4D3A@www.postgresql.com
List pgsql-bugs
The following bug has been logged online:

Bug reference:      1091
Logged by:          yychen

Email address:      yychen@mail.clhs.tyc.edu.tw

PostgreSQL version: 7.4

Operating system:   MS-WIN2000(Run With TAIWAN Big5)

Description:        Localization in EUC_TW Can't decode Big5 0xFA40--0xFEF0.

Details:

In a localized database:
 When I save a string containing Big5 characters in the range 0xFA40-0xFEF0
to a database (encoding EUC_TW or UNICODE) and then read it back,
PostgreSQL cannot decode these characters.
According to ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf:
3.3.4: BIG FIVE

    Big Five is the encoding system used on machines that support
MS-DOS or Windows, and also for Macintosh (such as the Chinese
Language Kit or the fully-localized operating system).

  Two-byte Standard Characters                  Encoding Ranges
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^                  ^^^^^^^^^^^^^^^
  first byte range                              0xA1-0xFE
  second byte ranges                            0x40-0x7E, 0xA1-0xFE

  One-byte Characters                           Encoding Range
  ^^^^^^^^^^^^^^^^^^^                           ^^^^^^^^^^^^^^
  ASCII                                         0x21-0x7E
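
As a rough illustration of the ranges in the table above, the following Python sketch checks whether a byte pair is a valid Big Five two-byte character and scans a byte string for codes in the reported range 0xFA40-0xFEF0. The function names are hypothetical, and the sketch only applies the byte ranges quoted here; it does not reflect how PostgreSQL's conversion code works.

```python
def is_big5_two_byte(first, second):
    """True if (first, second) fits the two-byte ranges quoted above."""
    lead_ok = 0xA1 <= first <= 0xFE
    trail_ok = 0x40 <= second <= 0x7E or 0xA1 <= second <= 0xFE
    return lead_ok and trail_ok


def find_reported(data):
    """Return the two-byte codes in 0xFA40-0xFEF0 found in a Big5 byte string."""
    hits = []
    i = 0
    while i < len(data):
        # Try to read a two-byte character starting at position i.
        if i + 1 < len(data) and is_big5_two_byte(data[i], data[i + 1]):
            code = (data[i] << 8) | data[i + 1]
            if 0xFA40 <= code <= 0xFEF0:
                hits.append(code)
            i += 2
        else:
            # One-byte character (or a byte that fits neither range): skip it.
            i += 1
    return hits
```

Running find_reported on the raw bytes of a string before it is sent to the server may help confirm whether the failing characters fall in the reported range.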

    The encoding used on Macintosh is quite similar to the above,
but has a slightly shortened two-byte range (second byte range up to
0xFC only) plus additional one-byte code points, namely 0x80
(backslash), 0xFD ("copyright" symbol: "c" in a circle), 0xFE
("trademark" symbol: "TM" as a superscript), and 0xFF ("ellipsis"
symbol: three dots).
