Cyrillic to UNICODE conversion - Mailing list pgsql-patches

From Victor Wagner
Subject Cyrillic to UNICODE conversion
Date
Msg-id Pine.LNX.4.30.0104262041500.9539-101000@party.ice.ru
Responses Re: Cyrillic to UNICODE conversion  (Tatsuo Ishii <t-ishii@sra.co.jp>)
List pgsql-patches
Despite the advertised support for conversion between Unicode and other
charsets, PostgreSQL-7.1 reports that conversion of UNICODE to KOI8 is not
supported. The same goes for WIN, ALT and other charsets.

As far as I can tell, these charsets were simply never added to the list
of 8-bit charsets that should be converted, perhaps because their maps
are stored in another directory on ftp.unicode.org (see VENDORS/MicroSoft
for the cp1251 and cp866 maps, and somewhere else for KOI8-R.TXT). In any
case, all of those maps are included in the catdoc distribution.

The attached patch fixes this problem. It adds the script UCS_to_cyrillic.pl
to the src/backend/utils/mb/Unicode directory. The mapping from PostgreSQL
charset names to file names (as they appear in the catdoc distribution, i.e.
lowercased) is hardcoded into the script. It is an almost exact copy of the
UCS_to_iso script, with only the file and constant names changed.
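
For readers who do not have the script at hand, here is a rough Python
sketch of what a generator of this kind does. It is not the Perl code from
the patch. The file names in CHARSET_FILES, the function names, and the
choice to skip codes below 0x80 are all assumptions made for illustration.
The input is assumed to follow the usual unicode.org table layout: local
code, Unicode code point, then a comment.

```python
# Hypothetical charset-name -> file-name table; the names actually hardcoded
# in UCS_to_cyrillic.pl may differ.
CHARSET_FILES = {
    "KOI8": "koi8-r.txt",
    "WIN": "cp1251.txt",
    "ALT": "cp866.txt",
}


def parse_mapping(lines):
    """Return (local_code, ucs_code) pairs from a unicode.org-style table."""
    pairs = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # code left undefined in this charset
        local, ucs = int(fields[0], 16), int(fields[1], 16)
        if local >= 0x80:  # assumption: ASCII maps to itself, no table needed
            pairs.append((local, ucs))
    return pairs


def utf8_as_int(ucs):
    """Pack the UTF-8 encoding of a code point into one integer."""
    return int.from_bytes(chr(ucs).encode("utf-8"), "big")


def build_maps(pairs):
    """Build both lookup tables, each sorted by its search key."""
    utf_to_local = sorted((utf8_as_int(u), l) for l, u in pairs)
    local_to_utf = sorted((l, utf8_as_int(u)) for l, u in pairs)
    return utf_to_local, local_to_utf


sample = [
    "# KOI8-R excerpt",
    "0x41\t0x0041\t# LATIN CAPITAL LETTER A",
    "0xC0\t0x044E\t# CYRILLIC SMALL LETTER YU",
    "0xE1\t0x0410\t# CYRILLIC CAPITAL LETTER A",
]
utf_to_local, local_to_utf = build_maps(parse_mapping(sample))
```

Both tables are sorted by their lookup key so that a conversion routine
can search them with a binary search.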

The generated maps are included in the patch, since they are also shipped in
the source tarball. The original mapping files they are generated from are
omitted, because make distclean removes them.

The file src/backend/mb/conv.c is modified to include the new maps and
provide the appropriate conversion functions.
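
To show the general idea of a table-driven conversion function, here is
a small Python sketch. It is not the C code in conv.c. The table is a
three-entry excerpt of KOI8-R, and all names (KOI8_TO_UCS, lookup,
koi8_to_unicode, unicode_to_koi8) are hypothetical.

```python
from bisect import bisect_left

# Excerpt of KOI8-R: (local byte, Unicode code point), sorted by byte.
KOI8_TO_UCS = [(0xC0, 0x044E), (0xC1, 0x0430), (0xC2, 0x0431)]
# The reverse table, sorted by code point.
UCS_TO_KOI8 = sorted((u, b) for b, u in KOI8_TO_UCS)


def lookup(table, key):
    """Binary-search a sorted (key, value) table; return None if absent."""
    i = bisect_left(table, (key,))
    if i < len(table) and table[i][0] == key:
        return table[i][1]
    return None


def koi8_to_unicode(data):
    """Convert KOI8 bytes to a str; ASCII bytes pass through unchanged."""
    out = []
    for b in data:
        ucs = b if b < 0x80 else lookup(KOI8_TO_UCS, b)
        if ucs is None:
            raise ValueError("no mapping for byte 0x%02X" % b)
        out.append(chr(ucs))
    return "".join(out)


def unicode_to_koi8(text):
    """Convert a str to KOI8 bytes using the reverse table."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        b = cp if cp < 0x80 else lookup(UCS_TO_KOI8, cp)
        if b is None:
            raise ValueError("no mapping for U+%04X" % cp)
        out.append(b)
    return bytes(out)
```

With the excerpt above, unicode_to_koi8(koi8_to_unicode(b"a\xc1\xc2"))
gives back the original bytes. A character missing from the table raises
an error instead of being dropped silently.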



--
Victor Wagner            vitus@ice.ru
Chief Technical Officer        Office:7-(095)-748-53-88
Communiware.Net         Home: 7-(095)-135-46-61
http://www.communiware.net      http://www.ice.ru/~vitus
