Re: Chinese GB18030 support is implemented! - Mailing list pgsql-patches
From | Bill Huang |
---|---|
Subject | Re: Chinese GB18030 support is implemented! |
Date | |
Msg-id | 3D084D76.8040509@ybb.ne.jp |
In response to | Chinese GB18030 support is implemented! (Bill Huang <bill_huanghb@ybb.ne.jp>) |
Responses | Re: Chinese GB18030 support is implemented! |
List | pgsql-patches |
Hi Ishii-san,

The patches are attached. Please apply them.

Thanks,
Bill

Tatsuo Ishii wrote:

>>Hello,
>>
>>As PostgreSQL is widely used around the world, many Chinese users are
>>looking forward to using such a high-performance database management
>>system. However, since the new Chinese codepage standard GB18030 is not
>>completely supported, PostgreSQL's use in China is limited.
>>
>>Now I have managed to implement GB18030 support on top of the latest
>>version, so the following functions are added once the patches are applied.
>>
>>-Chinese GB18030 encoding is available on the front-end side, while on
>>the backend side, EUC_CN or MIC is used.
>>-Encoding conversion between MIC and GB18030 is implemented.
>>-GB18030 locale support is available on the front-end side.
>>-A GB18030 locale test is added.
>>
>>Any help with testing these patches and any suggestions for GB18030
>>support are greatly appreciated.
>>
>
>We need to apply your patches to the current source tree (we are not
>allowed to add new features to the stable source tree). Your patches for
>encnames.c, pg_wchar.h and wchar.c are rejected due to the differences
>between 7.2 and current.
>
>Can you give me patches for encnames.c, pg_wchar.h and wchar.c against
>current?
>
>The Unicode conversion map files ISO10646-GB18030.TXT, utf8_to_gb18030.map,
>UCS_to_GB18030.pl and gb18030_to_utf8.map look good for
>current, so I will apply them.
>--
>Tatsuo Ishii
>

--
/---------------------------/
(Bill Huang)
E-mail:bill_huanghb@ybb.ne.jp
Cell phone:090-9979-4631
/---------------------------/

--- pgsql/src/backend/utils/mb/encnames.c.org	Thu Jun 13 16:15:54 2002
+++ pgsql/src/backend/utils/mb/encnames.c	Thu Jun 13 16:26:42 2002
@@ -239,6 +239,9 @@
 	{
 		"windows950", PG_BIG5
 	},							/* alias for BIG5 */
+	{
+		"gb18030", PG_GB18030
+	},							/* GB18030;GB18030 */
 	{
 		NULL, 0
@@ -353,6 +356,9 @@
 	},
 	{
 		"WIN1250", PG_WIN1250
+	},
+	{
+		"gb18030", PG_GB18030
 	}
 };
--- pgsql/src/include/mb/pg_wchar.h.org	Thu Jun 13 16:39:41 2002
+++ pgsql/src/include/mb/pg_wchar.h	Thu Jun 13 16:43:47 2002
@@ -190,6 +190,7 @@
 	PG_UHC,						/* UHC (Windows-949) */
 	PG_WIN1250,					/* windows-1250 */
+	PG_GB18030,					/* GB18030 */
 	_PG_LAST_ENCODING_			/* mark only */
 } pg_enc;
--- pgsql/src/backend/utils/mb/wchar.c.org	Thu Jun 13 16:37:06 2002
+++ pgsql/src/backend/utils/mb/wchar.c	Thu Jun 13 16:33:17 2002
@@ -510,6 +510,31 @@
 	return (len);
 }
+/*
+ * * GB18030
+ * * Added by Bill Huang <bhuang@redhat.com>,<bill_huanghb@ybb.ne.jp>
+ * */
+static int
+pg_gb18030_mblen(const unsigned char *s)
+{
+	int			len;
+	if (*s <= 0x7f)
+	{							/* ASCII */
+		len = 1;
+	}
+	else
+	{
+		if((*(s+1) >= 0x40 && *(s+1) <= 0x7e)|| (*(s+1) >= 0x80 && *(s+1) <= 0xfe))
+			len = 2;
+		else if(*(s+1) >= 0x30 && *(s+1) <= 0x39)
+			len = 4;
+		else
+			len = 2;
+	}
+	return (len);
+}
+
+
 pg_wchar_tbl pg_wchar_table[] = {
 	{pg_ascii2wchar_with_len, pg_ascii_mblen, 1},	/* 0; PG_SQL_ASCII */
 	{pg_eucjp2wchar_with_len, pg_eucjp_mblen, 3},	/* 1; PG_EUC_JP */
@@ -544,6 +569,7 @@
 	{0, pg_gbk_mblen, 2},		/* 30; PG_GBK */
 	{0, pg_uhc_mblen, 2},		/* 31; PG_UHC */
 	{pg_latin12wchar_with_len, pg_latin1_mblen, 1},	/* 32; PG_WIN1250 */
+	{0, pg_gb18030_mblen, 2}	/* 33; PG_GB18030 */
 };

 /* returns the byte length of a word for mule internal code */
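
For anyone testing the length logic of pg_gb18030_mblen outside the backend, the following is a minimal standalone sketch of the same byte-range rules. It is not part of the attached patch: the function name gb18030_char_len and the avail parameter are hypothetical, and unlike the patch it checks the available length and returns 0 for truncated or malformed sequences instead of defaulting to 2. It assumes the usual GB18030 layout: one byte for 0x00-0x7f, two bytes with a trail byte in 0x40-0x7e or 0x80-0xfe, and four bytes when the second and fourth bytes are in 0x30-0x39.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of the patch: return the byte length of the
 * GB18030 character starting at s, reading at most `avail` bytes.
 * Returns 0 if the sequence is truncated or malformed.
 */
int
gb18030_char_len(const unsigned char *s, size_t avail)
{
	assert(s != NULL);

	if (avail == 0)
		return 0;
	if (s[0] <= 0x7f)
		return 1;				/* ASCII */
	if (s[0] == 0x80 || s[0] == 0xff)
		return 0;				/* not a valid lead byte */
	if (avail < 2)
		return 0;				/* lead byte without a second byte */

	/* two-byte form: trail byte in 0x40-0x7e or 0x80-0xfe */
	if ((s[1] >= 0x40 && s[1] <= 0x7e) || (s[1] >= 0x80 && s[1] <= 0xfe))
		return 2;

	/* four-byte form: second and fourth bytes are 0x30-0x39 */
	if (s[1] >= 0x30 && s[1] <= 0x39)
	{
		if (avail < 4)
			return 0;
		if (s[2] >= 0x81 && s[2] <= 0xfe && s[3] >= 0x30 && s[3] <= 0x39)
			return 4;
	}
	return 0;					/* anything else is malformed */
}
```

Calling it on the bytes 0xb0 0xa1 with avail set to 2 takes the two-byte branch, while 0x81 0x30 0x81 0x30 with avail set to 4 takes the four-byte branch.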