Re: BUG #4714: Unicode Big5 Conversion - Mailing list pgsql-bugs

From Heikki Linnakangas
Subject Re: BUG #4714: Unicode Big5 Conversion
Msg-id 49C0C28B.8050304@enterprisedb.com
In response to BUG #4714: Unicode Big5 Conversion  ("Roger Chang" <rchang111@gmail.com>)
List pgsql-bugs
Roger Chang wrote:
> The following bug has been logged online:
>
> Bug reference:      4714
> Logged by:          Roger Chang
> Email address:      rchang111@gmail.com
> PostgreSQL version: 8.3
> Operating system:   All
> Description:        Unicode Big5 Conversion
> Details:
>
> This is NOT a bug. but cause some problem. Since long time and up to 8.3.7
> still no one to response.
>
> Chinese Big5/UTF8 Conversion-map
> big5_to_utf8.map
> utf8_to_big5.map
> have missing some char, don't know to who to talk to?
> Suffer doing source build every version upgrade.
>
> Please somebody can help and add following char map to above map. (+7
> char.)
>
>  {0xf9d6, 0xe7a281},
>  {0xf9d7, 0xe98ab9},
>  {0xf9d8, 0xe8a38f},
>  {0xf9d9, 0xe5a2bb},
>  {0xf9da, 0xe68192},
>  {0xf9db, 0xe7b2a7},
>  {0xf9dc, 0xe5abba}
>
> Thanks in Advance.
>
> Myself will like to help to do these job in future, feel need to do some
> help to PostgreSQL after using it for so many many years.

Thanks!

Looking up those Unicode characters in the Unihan database at
http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=92B9&useutf8=true

doesn't give any mapping to the Big5 encoding. At the bottom, however,
there is a mapping to "kHKSCS", which matches the mapping you listed.
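To repeat that lookup for all seven entries, the UTF-8 values from the proposed map first have to be turned into Unicode code points. The following is only a rough sketch of that step, assuming the map values are UTF-8 byte sequences packed into integers as shown in the report; the names PROPOSED and utf8_int_to_codepoint are made up for illustration and are not part of PostgreSQL.

```python
# Proposed entries copied from the bug report: Big5 code -> packed UTF-8 bytes.
PROPOSED = {
    0xF9D6: 0xE7A281,
    0xF9D7: 0xE98AB9,
    0xF9D8: 0xE8A38F,
    0xF9D9: 0xE5A2BB,
    0xF9DA: 0xE68192,
    0xF9DB: 0xE7B2A7,
    0xF9DC: 0xE5ABBA,
}


def utf8_int_to_codepoint(packed: int) -> int:
    """Decode a UTF-8 sequence packed into an integer and return its code point.

    Hypothetical helper: assumes the integer holds exactly one character.
    """
    raw = packed.to_bytes((packed.bit_length() + 7) // 8, "big")
    text = raw.decode("utf-8")  # raises UnicodeDecodeError on invalid UTF-8
    if len(text) != 1:
        raise ValueError(f"expected one character, got {len(text)}")
    return ord(text)


# Big5 code -> Unicode code point, ready to paste into a Unihan query.
codepoints = {big5: utf8_int_to_codepoint(u8) for big5, u8 in PROPOSED.items()}

for big5, cp in codepoints.items():
    print(f"0x{big5:04x} -> U+{cp:04X}")
```

The hexadecimal part after "U+" is what goes into the codepoint= parameter of the Unihan URL above; the 0xf9d7 entry yields 92B9, the code point used in that link.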

Looking at the Wikipedia page for Big5, it seems that those characters
belong to Microsoft's ETEN extension. The page also claims that "The
ETen extension became part of the current Big5 standard through
popularity." Is that true? Do we support all the other characters in the
ETEN extension?

Is there some authoritative source for the Big5 encoding, to look these
things up?

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
