A Patch for MIC to EUC_TW code converting in mb support - Mailing list pgsql-patches

From Chih-Chang Hsieh
Subject A Patch for MIC to EUC_TW code converting in mb support
Date
Msg-id 3A0A07FA.129ED448@cc.kmu.edu.tw
Responses Re: A Patch for MIC to EUC_TW code converting in mb support
List pgsql-patches
============================================================================
POSTGRESQL BUG REPORT: MIC to EUC_TW code converting in mb support
============================================================================

System Configuration
---------------------
  Architecture (example: Intel Pentium)         :x86
  Operating System (example: Linux 2.0.26 ELF)  :Linux 2.2.x and FreeBSD 3.5R
  PostgreSQL version (example: PostgreSQL-7.0)  :PostgreSQL-7.0.2
  Compiler used (example:  gcc 2.8.0)           :egcs-2.91.66, gcc 2.7.3

A FULL description of the problem:
------------------------------------------------
In PostgreSQL's mb (multi-byte) support, there is a bug in the code
conversion from MIC to EUC_TW. The original mic2euc_tw() in conv.c converts
CNS 11643-1992 Plane 2 characters into the 2-byte EUC_TW encoding. Characters
in CNS 11643-1992 Plane 2 should instead be converted into the 4-byte EUC_TW
encoding (SS2, 0xa2, followed by the two character bytes).
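
As a rough illustration of the intended behaviour (not the actual conv.c
code), the per-character mapping can be sketched as follows. The function
name sketch_mic_to_euc_tw and the SKETCH_* constants are hypothetical; the
leading-byte values 0x95 and 0x96 are assumed to match LC_CNS11643_1 and
LC_CNS11643_2 in pg_wchar.h.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative constants; the two leading-byte values are assumptions. */
#define SKETCH_SS2            0x8e  /* EUC single-shift 2 */
#define SKETCH_LC_CNS11643_1  0x95  /* assumed MIC leading byte, Plane 1 */
#define SKETCH_LC_CNS11643_2  0x96  /* assumed MIC leading byte, Plane 2 */

/*
 * Hypothetical helper: encode one MIC character (leading byte lc followed
 * by the two character bytes b1 and b2) as EUC_TW.
 * Writes at most 4 bytes to out and returns the number of bytes written,
 * or 0 if lc is not a charset this sketch handles.
 */
size_t
sketch_mic_to_euc_tw(unsigned char lc, unsigned char b1, unsigned char b2,
                     unsigned char *out)
{
    assert(out != NULL);

    if (lc == SKETCH_LC_CNS11643_1)
    {
        /* Plane 1: 2-byte EUC_TW, the character bytes are copied as-is. */
        out[0] = b1;
        out[1] = b2;
        return 2;
    }
    if (lc == SKETCH_LC_CNS11643_2)
    {
        /* Plane 2: 4-byte EUC_TW, SS2 + plane byte 0xa2 + character bytes. */
        out[0] = SKETCH_SS2;
        out[1] = 0xa2;
        out[2] = b1;
        out[3] = b2;
        return 4;
    }
    return 0;
}
```

With the original code, the Plane 2 branch behaves like the Plane 1 branch,
so the SS2 and 0xa2 prefix bytes are never written.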

A way to repeat the problem:
----------------------------------------------------------------------
When you initdb with -E EUC_TW and set PGCLIENTENCODING to BIG5,
you will find all the characters in CNS 11643-1992 Plane 2 are
incorrectly stored or output.

This problem might be fixed by the patch in the attachment.

*** conv.c    Wed Nov  8 22:44:21 2000
--- conv.c.orig    Sat May 20 21:12:26 2000
***************
*** 906,920 ****
      {
          len -= pg_mic_mblen(mic++);

!         if (c1 == LC_CNS11643_1)
          {
-             *p++ = *mic++;
-             *p++ = *mic++;
-         }
-         else if (c1 == LC_CNS11643_2)
-         {
-             *p++ = SS2;
-             *p++ = 0xa2;
              *p++ = *mic++;
              *p++ = *mic++;
          }
--- 906,913 ----
      {
          len -= pg_mic_mblen(mic++);

!         if (c1 == LC_CNS11643_1 || c1 == LC_CNS11643_2)
          {
              *p++ = *mic++;
              *p++ = *mic++;
          }
