Thread: Fwd: Re: Re: patch to support KOI8-U <==> UTF-8 conversions (2nd try)

Fwd: Re: Re: patch to support KOI8-U <==> UTF-8 conversions (2nd try)

From
Andy Rysin
Date:
Oops, I sent the last one directly to Bruce (obviously today is not my
day :-)

Forwarding it to the list just in case.

Sorry for the mess,
Andy

Note: forwarded message attached.


__________________________________________________
Do You Yahoo!?
Yahoo! Auctions - buy the things you want at great prices
http://auctions.yahoo.com/
Maybe it just slipped somewhere on the road? :-)
Well, it's probably me who forgot to attach it; here it is :-)
BTW, it does not add an encoding; it just patches the existing one (KOI8)
to support two, KOI8-R and KOI8-U (the latter is a superset of the former
if you don't take the pseudographics into account).
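
To make the "superset apart from pseudographics" point concrete: KOI8-U puts four pairs of Ukrainian letters at eight byte positions where KOI8-R has box-drawing characters. The sketch below is only an illustration, not code from the patch. It lists those eight positions and recomputes the packed UTF-8 values that appear in the patched KOI8_to_utf8.map entries further down the thread. The names `utf8_pack`, `koi8u_letter` and `koi8u_letter_to_utf8` are hypothetical helpers invented for this example.

```c
#include <stdint.h>
#include <stddef.h>

/* Pack the UTF-8 bytes of a code point into one integer, first byte in the
 * highest position, matching how the map entries are written.
 * Covers U+0000..U+FFFF only, which is enough for this table. */
uint32_t utf8_pack(uint32_t cp)
{
    if (cp < 0x80)
        return cp;                                   /* 1 byte  */
    if (cp < 0x800)
        return ((0xC0u | (cp >> 6)) << 8)            /* 2 bytes */
             | (0x80u | (cp & 0x3Fu));
    return ((0xE0u | (cp >> 12)) << 16)              /* 3 bytes */
         | ((0x80u | ((cp >> 6) & 0x3Fu)) << 8)
         | (0x80u | (cp & 0x3Fu));
}

/* Hypothetical helper type, not part of the patch. */
typedef struct
{
    uint8_t  koi8;  /* byte value in KOI8-U */
    uint32_t ucs;   /* Unicode code point */
} koi8u_letter;

/* The eight positions that hold letters in KOI8-U and box-drawing
 * characters in KOI8-R. */
const koi8u_letter koi8u_letters[8] = {
    {0xa4, 0x0454},  /* CYRILLIC SMALL LETTER UKRAINIAN IE */
    {0xa6, 0x0456},  /* CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I */
    {0xa7, 0x0457},  /* CYRILLIC SMALL LETTER YI */
    {0xad, 0x0491},  /* CYRILLIC SMALL LETTER GHE WITH UPTURN */
    {0xb4, 0x0404},  /* CYRILLIC CAPITAL LETTER UKRAINIAN IE */
    {0xb6, 0x0406},  /* CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I */
    {0xb7, 0x0407},  /* CYRILLIC CAPITAL LETTER YI */
    {0xbd, 0x0490},  /* CYRILLIC CAPITAL LETTER GHE WITH UPTURN */
};

/* Packed UTF-8 value for one of the eight letters, or 0 for any other byte. */
uint32_t koi8u_letter_to_utf8(uint8_t byte)
{
    for (size_t i = 0; i < 8; i++)
        if (koi8u_letters[i].koi8 == byte)
            return utf8_pack(koi8u_letters[i].ucs);
    return 0;
}
```

Calling `koi8u_letter_to_utf8(0xa4)` gives the same packed value as the `{0x00a4, 0xd194}` entry in the patch; every other byte keeps its KOI8-R meaning.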

Thanks,
Andriy

--- Bruce Momjian <pgman@candle.pha.pa.us> wrote:
>
> I don't see no patch.  :-)
>
> > well, it seems like it's OK to attach patches here, so here's my
> > patch...
> >
> > --- Andy Rysin <arysin@yahoo.com> wrote:
> > > Hello everybuddy,
> > >
> > > I created a patch to add support for the KOI8-U (Ukrainian) encoding
> > > to PostgreSQL. It handles Unicode conversions. Actually KOI8-U adds 4
> > > pairs of letters to KOI8-R, and it was already there in the
> > > single-byte recodings, so I didn't create a new encoding and just
> > > patched the 'KOI8' one.
> > >
> > > Shall I attach it right in my email to this list? It's around 6K.
> > >
> > > Thanks in advance,
> > > Andriy
> > >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 4: Don't 'kill -9' the postmaster
> >
>
> --
>   Bruce Momjian                        |  http://candle.pha.pa.us
>   pgman@candle.pha.pa.us               |  (610) 853-3000
>   +  If your life is a hard drive,     |  830 Blythe Avenue
>   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026



Attachment

Re: Fwd: Re: Re: patch to support KOI8-U <==> UTF-8 conversions (2nd try)

From
Bruce Momjian
Date:
Here is the actual patch I applied.  Seems to only affect Korean.  I did
not apply this part of the patch:

    +//#ifndef HAVE_KOI8_U_IN_JDK
               dbEncoding = "KOI8_R";
    +//#else
    +// if you have KOI8_U conversion classes - we have to put it as parameter in 'congigure'
    +//          dbEncoding = "KOI8_U";
    +//#endif

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Index: doc/src/sgml/charset.sgml
===================================================================
RCS file: /home/projects/pgsql/cvsroot/pgsql/doc/src/sgml/charset.sgml,v
retrieving revision 2.7
diff -u -3 -p -u -r2.7 charset.sgml
--- doc/src/sgml/charset.sgml    2001/04/20 15:52:33    2.7
+++ doc/src/sgml/charset.sgml    2001/05/03 18:24:13
@@ -353,7 +353,7 @@ perl: warning: Falling back to the stand
     </row>
     <row>
      <entry>KOI8</entry>
-     <entry>KOI8-R</entry>
+     <entry>KOI8-R(U)</entry>
     </row>
     <row>
      <entry>WIN</entry>
Index: src/backend/utils/mb/Unicode/KOI8_to_utf8.map
===================================================================
RCS file: /home/projects/pgsql/cvsroot/pgsql/src/backend/utils/mb/Unicode/KOI8_to_utf8.map,v
retrieving revision 1.1
diff -u -3 -p -u -r1.1 KOI8_to_utf8.map
--- src/backend/utils/mb/Unicode/KOI8_to_utf8.map    2001/04/29 07:27:38    1.1
+++ src/backend/utils/mb/Unicode/KOI8_to_utf8.map    2001/05/03 18:24:14
@@ -35,32 +35,32 @@ static pg_local_to_utf LUmapKOI8[ 128 ]
   {0x00a1, 0xe29591},
   {0x00a2, 0xe29592},
   {0x00a3, 0xd191},
-  {0x00a4, 0xe29593},
+  {0x00a4, 0xd194},
   {0x00a5, 0xe29594},
-  {0x00a6, 0xe29595},
-  {0x00a7, 0xe29596},
+  {0x00a6, 0xd196},
+  {0x00a7, 0xd197},
   {0x00a8, 0xe29597},
   {0x00a9, 0xe29598},
   {0x00aa, 0xe29599},
   {0x00ab, 0xe2959a},
   {0x00ac, 0xe2959b},
-  {0x00ad, 0xe2959c},
+  {0x00ad, 0xd291},
   {0x00ae, 0xe2959d},
   {0x00af, 0xe2959e},
   {0x00b0, 0xe2959f},
   {0x00b1, 0xe295a0},
   {0x00b2, 0xe295a1},
   {0x00b3, 0xd081},
-  {0x00b4, 0xe295a2},
+  {0x00b4, 0xd084},
   {0x00b5, 0xe295a3},
-  {0x00b6, 0xe295a4},
-  {0x00b7, 0xe295a5},
+  {0x00b6, 0xd086},
+  {0x00b7, 0xd087},
   {0x00b8, 0xe295a6},
   {0x00b9, 0xe295a7},
   {0x00ba, 0xe295a8},
   {0x00bb, 0xe295a9},
   {0x00bc, 0xe295aa},
-  {0x00bd, 0xe295ab},
+  {0x00bd, 0xd290},
   {0x00be, 0xe295ac},
   {0x00bf, 0xc2a9},
   {0x00c0, 0xd18e},
Index: src/backend/utils/mb/Unicode/utf8_to_KOI8.map
===================================================================
RCS file: /home/projects/pgsql/cvsroot/pgsql/src/backend/utils/mb/Unicode/utf8_to_KOI8.map,v
retrieving revision 1.1
diff -u -3 -p -u -r1.1 utf8_to_KOI8.map
--- src/backend/utils/mb/Unicode/utf8_to_KOI8.map    2001/04/29 07:27:38    1.1
+++ src/backend/utils/mb/Unicode/utf8_to_KOI8.map    2001/05/03 18:24:14
@@ -6,6 +6,9 @@ static pg_utf_to_local ULmap_KOI8[ 128 ]
   {0xc2b7, 0x009e},
   {0xc3b7, 0x009f},
   {0xd081, 0x00b3},
+  {0xd084, 0x00b4},
+  {0xd086, 0x00b6},
+  {0xd087, 0x00b7},
   {0xd090, 0x00e1},
   {0xd091, 0x00e2},
   {0xd092, 0x00f7},
@@ -71,6 +74,11 @@ static pg_utf_to_local ULmap_KOI8[ 128 ]
   {0xd18e, 0x00c0},
   {0xd18f, 0x00d1},
   {0xd191, 0x00a3},
+  {0xd194, 0x00a4},
+  {0xd196, 0x00a6},
+  {0xd197, 0x00a7},
+  {0xd290, 0x00bd},
+  {0xd291, 0x00ad},
   {0xe28899, 0x0095},
   {0xe2889a, 0x0096},
   {0xe28988, 0x0097},
@@ -92,31 +100,23 @@ static pg_utf_to_local ULmap_KOI8[ 128 ]
   {0xe29590, 0x00a0},
   {0xe29591, 0x00a1},
   {0xe29592, 0x00a2},
-  {0xe29593, 0x00a4},
   {0xe29594, 0x00a5},
-  {0xe29595, 0x00a6},
-  {0xe29596, 0x00a7},
   {0xe29597, 0x00a8},
   {0xe29598, 0x00a9},
   {0xe29599, 0x00aa},
   {0xe2959a, 0x00ab},
   {0xe2959b, 0x00ac},
-  {0xe2959c, 0x00ad},
   {0xe2959d, 0x00ae},
   {0xe2959e, 0x00af},
   {0xe2959f, 0x00b0},
   {0xe295a0, 0x00b1},
   {0xe295a1, 0x00b2},
-  {0xe295a2, 0x00b4},
   {0xe295a3, 0x00b5},
-  {0xe295a4, 0x00b6},
-  {0xe295a5, 0x00b7},
   {0xe295a6, 0x00b8},
   {0xe295a7, 0x00b9},
   {0xe295a8, 0x00ba},
   {0xe295a9, 0x00bb},
   {0xe295aa, 0x00bc},
-  {0xe295ab, 0x00bd},
   {0xe295ac, 0x00be},
   {0xe29680, 0x008b},
   {0xe29684, 0x008c},
Index: src/include/mb/pg_wchar.h
===================================================================
RCS file: /home/projects/pgsql/cvsroot/pgsql/src/include/mb/pg_wchar.h,v
retrieving revision 1.25
diff -u -3 -p -u -r1.25 pg_wchar.h
--- src/include/mb/pg_wchar.h    2001/03/22 04:00:49    1.25
+++ src/include/mb/pg_wchar.h    2001/05/03 18:24:16
@@ -28,7 +28,7 @@
 #define LATIN7 13                /* ISO-8859 Latin 7 */
 #define LATIN8 14                /* ISO-8859 Latin 8 */
 #define LATIN9 15                /* ISO-8859 Latin 9 */
-#define KOI8   16                /* KOI8-R */
+#define KOI8   16                /* KOI8-R/U */
 #define WIN    17                /* windows-1251 */
 #define ALT    18                /* Alternativny Variant (MS-DOS CP866) */
 /* followings are for client encoding only */
@@ -68,6 +68,7 @@ typedef unsigned int pg_wchar;
 #define LC_JISX0201K    0x89    /* Japanese 1 byte kana */
 #define LC_JISX0201R    0x8a    /* Japanese 1 byte Roman */
 #define LC_KOI8_R    0x8c        /* Cyrillic KOI8-R */
+#define LC_KOI8_U    0x8c        /* Cyrillic KOI8-U */
 #define LC_GB2312_80    0x91    /* Chinese */
 #define LC_JISX0208 0x92        /* Japanese Kanji */
 #define LC_KS5601    0x93        /* Korean */
Index: src/interfaces/odbc/multibyte.h
===================================================================
RCS file: /home/projects/pgsql/cvsroot/pgsql/src/interfaces/odbc/multibyte.h,v
retrieving revision 1.3
diff -u -3 -p -u -r1.3 multibyte.h
--- src/interfaces/odbc/multibyte.h    2001/03/27 04:00:54    1.3
+++ src/interfaces/odbc/multibyte.h    2001/05/03 18:24:21
@@ -21,7 +21,7 @@
 #define LATIN7                13    /* ISO-8859 Latin 7 */
 #define LATIN8                14    /* ISO-8859 Latin 8 */
 #define LATIN9                15    /* ISO-8859 Latin 9 */
-#define KOI8                16    /* KOI8-R */
+#define KOI8                16    /* KOI8-R/U */
 #define WIN                    17    /* windows-1251 */
 #define ALT                    18    /* Alternativny Variant (MS-DOS CP866) */
 #define SJIS                32    /* Shift JIS */
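
The new rows in utf8_to_KOI8.map are inserted so that the table stays in ascending order of the UTF-8 value, which is what a binary search needs. The thread does not show how the table is searched, so the following is only a sketch under the assumption that a standard `bsearch` lookup is used. The `utf_to_local` struct, its field names, `sample_map` and `utf8_to_koi8` are hypothetical stand-ins, and the table holds just a few rows copied from the patch.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for pg_utf_to_local; field names are assumed. */
typedef struct
{
    uint32_t utf;   /* UTF-8 bytes packed into one integer */
    uint32_t code;  /* KOI8 byte value */
} utf_to_local;

/* A few rows from the patched table, in ascending order of utf. */
const utf_to_local sample_map[] = {
    {0xd081, 0x00b3},
    {0xd084, 0x00b4},
    {0xd086, 0x00b6},
    {0xd087, 0x00b7},
    {0xd090, 0x00e1},
    {0xd18f, 0x00d1},
    {0xd191, 0x00a3},
    {0xd194, 0x00a4},
    {0xd196, 0x00a6},
    {0xd197, 0x00a7},
    {0xd290, 0x00bd},
    {0xd291, 0x00ad},
    {0xe28899, 0x0095},
    {0xe29590, 0x00a0},
    {0xe29594, 0x00a5},
};

/* Comparison callback for bsearch: key is a packed UTF-8 value. */
static int cmp_utf(const void *key, const void *elem)
{
    uint32_t k = *(const uint32_t *) key;
    uint32_t e = ((const utf_to_local *) elem)->utf;
    return (k > e) - (k < e);
}

/* Return the KOI8 byte for a packed UTF-8 value, or 0 if it is not mapped. */
uint32_t utf8_to_koi8(uint32_t utf)
{
    const utf_to_local *hit = bsearch(&utf, sample_map,
                                      sizeof sample_map / sizeof sample_map[0],
                                      sizeof sample_map[0], cmp_utf);
    return hit ? hit->code : 0;
}
```

With these rows, `utf8_to_koi8(0xd194)` finds the new Ukrainian letter at 0xa4, while the box-drawing value 0xe29593 that the patch removes is no longer found. An out-of-order row would make the search miss entries silently, so new rows have to go in at their sorted position, as the patch does.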

Great but what do you mean by 'affect Korean'???

Andy

--- Bruce Momjian <pgman@candle.pha.pa.us> wrote:
> Here is the actual patch I applied.  Seems to only affect Korean.  I did
> not apply this part of the patch:
>
>  +//#ifndef HAVE_KOI8_U_IN_JDK
>             dbEncoding = "KOI8_R";
>  +//#else
>  +// if you have KOI8_U conversion classes - we have to put it as parameter in 'congigure'
>  +//          dbEncoding = "KOI8_U";
>  +//#endif
>

Well, KOI8-R is Russian, and it was extended to support Ukrainian
(4 additional pairs of letters) in a new set named KOI8-U. But
definitely neither of them has anything to do with Korean. :-)

take care,
Andy

--- Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Here is the actual patch I applied.  Seems to only affect Korean.
>
> I thought KOI8 was Russian?
>
>             regards, tom lane



That's just me again; here's a normal patch for KOI8_U to
jdbc/Connection.java

Andy
P.S. In Connection.java, if encoding=="WIN" then dbEncoding is set to
"Cp1252". What if it's Cyrillic "WIN"? Then it should be "Cp1251". Is
there any way to fix that without making different "WIN" encodings in
PostgreSQL?

--- Bruce Momjian <pgman@candle.pha.pa.us> wrote:
> I did not apply this part of the patch:
>
>     +//#ifndef HAVE_KOI8_U_IN_JDK
>                dbEncoding = "KOI8_R";
>     +//#else
>     +// if you have KOI8_U conversion classes - we have to put it as parameter in 'congigure'
>     +//          dbEncoding = "KOI8_U";
>     +//#endif



Attachment

Re: Fwd: Re: Re: patch to support KOI8-U <==> UTF-8 conversions (2nd try)

From
Bruce Momjian
Date:
> Great but what do you mean by 'affect Korean'???

Oops, sorry.  I thought this was fixing Korean encoding.  You now know
how little I know about locales and multibyte.


--
  Bruce Momjian

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Here is the actual patch I applied.  Seems to only affect Korean.

I thought KOI8 was Russian?

            regards, tom lane

Re: Fwd: Re: Re: patch to support KOI8-U <==> UTF-8 conversions (2nd try)

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Here is the actual patch I applied.  Seems to only affect Korean.
>
> I thought KOI8 was Russian?

Oh, thanks.  I was going to guess Kanji as my second guess.  :-)

--
  Bruce Momjian

Sorry, I can't patch this into 7.1.1.  It is a feature addition with
little testing, and it touches the main jdbc code.  I will keep it for
7.2.

> That's just me again; here's a normal patch for KOI8_U to
> jdbc/Connection.java
>
> Andy
> P.S. In Connection.java, if encoding=="WIN" then dbEncoding is set to
> "Cp1252". What if it's Cyrillic "WIN"? Then it should be "Cp1251". Is
> there any way to fix that without making different "WIN" encodings in
> PostgreSQL?
>
> --- Bruce Momjian <pgman@candle.pha.pa.us> wrote:
> > I did not apply this part of the patch:
> >
> >     +//#ifndef HAVE_KOI8_U_IN_JDK
> >                dbEncoding = "KOI8_R";
> >     +//#else
> >     +// if you have KOI8_U conversion classes - we have to put it as parameter in 'congigure'
> >     +//          dbEncoding = "KOI8_U";
> >     +//#endif

Content-Description: pgsql-7.1-koi8-u.jdbc.patch

[ Attachment, skipping... ]

--
  Bruce Momjian

Thanks.  Applied.


> that's just me again, here's normal patch for KOI8_U to
> jdbc/Connection.java

--
  Bruce Momjian