Codepage Win1252 - Mailing list pgsql-general
From | Jörg Schulz |
---|---|
Subject | Codepage Win1252 |
Date | |
Msg-id | bjru5n$7uk$1@news.hub.org |
List | pgsql-general |
I have been missing this codepage for quite some time, but I was able to
patch another, unneeded mapping to fit my needs. Unfortunately, I wasn't
able to add a completely new mapping. Maybe one of you can do this
better... :-)

I have added some tiny scripts that generate at least the needed mappings
for the src/backend/utils/mb/Unicode/*.map files.

I hope this helps PostgreSQL support more codepages.

Jörg

jschulz@opal:~/programme/postgresql/pgmaps> cat README
Do a copy and paste from a codepage reference under
http://www.microsoft.com/globaldev/reference/cphome.mspx

For example win1252 was copied from
http://www.microsoft.com/globaldev/reference/sbcs/1252.htm

then type e.g.

make_pgmaps win1252 ...

jschulz@opal:~/programme/postgresql/pgmaps> cat make_pgmaps
#!/bin/bash
# Generate both map files for every codepage file named on the command line.
for f in "$@"; do
  echo -e "${f}: ${f}_to_utf8.map...\c"
  ./codepage_to_utf8 "${f}" > "${f}_to_utf8.map"
  echo -e "ok utf8_to_${f}.map...\c"
  ./utf8_to_codepage "${f}" > "utf8_to_${f}.map"
  echo "ok"
done

jschulz@opal:~/programme/postgresql/pgmaps> cat codepage_to_utf8
#!/bin/bash
# Columns 1-2 hold the codepage byte, columns 8-11 the Unicode code point.
while read -r l; do
  cp=$(echo "$l" | cut -c1-2)
  u16=$(echo "$l" | cut -c8-11)
  u8=$(echo "0x$u16" | recode utf-16/x4..utf-8/x4)
  echo " {0x00$cp, $u8},"
done < "$1" | awk '{print tolower($0)}'

jschulz@opal:~/programme/postgresql/pgmaps> cat utf8_to_codepage
#!/bin/bash
# Same parsing as codepage_to_utf8, with the pair reversed and sorted by UTF-8 value.
while read -r l; do
  cp=$(echo "$l" | cut -c1-2)
  u16=$(echo "$l" | cut -c8-11)
  u8=$(echo "0x$u16" | recode utf-16/x4..utf-8/x4)
  echo " {$u8, 0x00$cp},"
done < "$1" | awk '{print tolower($0)}' | sort

jschulz@opal:~/programme/postgresql/pgmaps> cat win1252
80 = U+20AC : EURO SIGN
82 = U+201A : SINGLE LOW-9 QUOTATION MARK
83 = U+0192 : LATIN SMALL LETTER F WITH HOOK
84 = U+201E : DOUBLE LOW-9 QUOTATION MARK
85 = U+2026 : HORIZONTAL ELLIPSIS
86 = U+2020 : DAGGER
87 = U+2021 : DOUBLE DAGGER
88 = U+02C6 : MODIFIER LETTER CIRCUMFLEX ACCENT
89 = U+2030 : PER MILLE SIGN
8A = U+0160 : LATIN CAPITAL LETTER S WITH CARON
8B = U+2039 : SINGLE LEFT-POINTING ANGLE QUOTATION MARK
8C = U+0152 : LATIN CAPITAL LIGATURE OE
8E = U+017D : LATIN CAPITAL LETTER Z WITH CARON
91 = U+2018 : LEFT SINGLE QUOTATION MARK
92 = U+2019 : RIGHT SINGLE QUOTATION MARK
93 = U+201C : LEFT DOUBLE QUOTATION MARK
94 = U+201D : RIGHT DOUBLE QUOTATION MARK
95 = U+2022 : BULLET
96 = U+2013 : EN DASH
97 = U+2014 : EM DASH
98 = U+02DC : SMALL TILDE
99 = U+2122 : TRADE MARK SIGN
9A = U+0161 : LATIN SMALL LETTER S WITH CARON
9B = U+203A : SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
9C = U+0153 : LATIN SMALL LIGATURE OE
9E = U+017E : LATIN SMALL LETTER Z WITH CARON
9F = U+0178 : LATIN CAPITAL LETTER Y WITH DIAERESIS
A0 = U+00A0 : NO-BREAK SPACE
A1 = U+00A1 : INVERTED EXCLAMATION MARK
A2 = U+00A2 : CENT SIGN
A3 = U+00A3 : POUND SIGN
A4 = U+00A4 : CURRENCY SIGN
A5 = U+00A5 : YEN SIGN
A6 = U+00A6 : BROKEN BAR
A7 = U+00A7 : SECTION SIGN
A8 = U+00A8 : DIAERESIS
A9 = U+00A9 : COPYRIGHT SIGN
AA = U+00AA : FEMININE ORDINAL INDICATOR
AB = U+00AB : LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
AC = U+00AC : NOT SIGN
AD = U+00AD : SOFT HYPHEN
AE = U+00AE : REGISTERED SIGN
AF = U+00AF : MACRON
B0 = U+00B0 : DEGREE SIGN
B1 = U+00B1 : PLUS-MINUS SIGN
B2 = U+00B2 : SUPERSCRIPT TWO
B3 = U+00B3 : SUPERSCRIPT THREE
B4 = U+00B4 : ACUTE ACCENT
B5 = U+00B5 : MICRO SIGN
B6 = U+00B6 : PILCROW SIGN
B7 = U+00B7 : MIDDLE DOT
B8 = U+00B8 : CEDILLA
B9 = U+00B9 : SUPERSCRIPT ONE
BA = U+00BA : MASCULINE ORDINAL INDICATOR
BB = U+00BB : RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
BC = U+00BC : VULGAR FRACTION ONE QUARTER
BD = U+00BD : VULGAR FRACTION ONE HALF
BE = U+00BE : VULGAR FRACTION THREE QUARTERS
BF = U+00BF : INVERTED QUESTION MARK
C0 = U+00C0 : LATIN CAPITAL LETTER A WITH GRAVE
C1 = U+00C1 : LATIN CAPITAL LETTER A WITH ACUTE
C2 = U+00C2 : LATIN CAPITAL LETTER A WITH CIRCUMFLEX
C3 = U+00C3 : LATIN CAPITAL LETTER A WITH TILDE
C4 = U+00C4 : LATIN CAPITAL LETTER A WITH DIAERESIS
C5 = U+00C5 : LATIN CAPITAL LETTER A WITH RING ABOVE
C6 = U+00C6 : LATIN CAPITAL LETTER AE
C7 = U+00C7 : LATIN CAPITAL LETTER C WITH CEDILLA
C8 = U+00C8 : LATIN CAPITAL LETTER E WITH GRAVE
C9 = U+00C9 : LATIN CAPITAL LETTER E WITH ACUTE
CA = U+00CA : LATIN CAPITAL LETTER E WITH CIRCUMFLEX
CB = U+00CB : LATIN CAPITAL LETTER E WITH DIAERESIS
CC = U+00CC : LATIN CAPITAL LETTER I WITH GRAVE
CD = U+00CD : LATIN CAPITAL LETTER I WITH ACUTE
CE = U+00CE : LATIN CAPITAL LETTER I WITH CIRCUMFLEX
CF = U+00CF : LATIN CAPITAL LETTER I WITH DIAERESIS
D0 = U+00D0 : LATIN CAPITAL LETTER ETH
D1 = U+00D1 : LATIN CAPITAL LETTER N WITH TILDE
D2 = U+00D2 : LATIN CAPITAL LETTER O WITH GRAVE
D3 = U+00D3 : LATIN CAPITAL LETTER O WITH ACUTE
D4 = U+00D4 : LATIN CAPITAL LETTER O WITH CIRCUMFLEX
D5 = U+00D5 : LATIN CAPITAL LETTER O WITH TILDE
D6 = U+00D6 : LATIN CAPITAL LETTER O WITH DIAERESIS
D7 = U+00D7 : MULTIPLICATION SIGN
D8 = U+00D8 : LATIN CAPITAL LETTER O WITH STROKE
D9 = U+00D9 : LATIN CAPITAL LETTER U WITH GRAVE
DA = U+00DA : LATIN CAPITAL LETTER U WITH ACUTE
DB = U+00DB : LATIN CAPITAL LETTER U WITH CIRCUMFLEX
DC = U+00DC : LATIN CAPITAL LETTER U WITH DIAERESIS
DD = U+00DD : LATIN CAPITAL LETTER Y WITH ACUTE
DE = U+00DE : LATIN CAPITAL LETTER THORN
DF = U+00DF : LATIN SMALL LETTER SHARP S
E0 = U+00E0 : LATIN SMALL LETTER A WITH GRAVE
E1 = U+00E1 : LATIN SMALL LETTER A WITH ACUTE
E2 = U+00E2 : LATIN SMALL LETTER A WITH CIRCUMFLEX
E3 = U+00E3 : LATIN SMALL LETTER A WITH TILDE
E4 = U+00E4 : LATIN SMALL LETTER A WITH DIAERESIS
E5 = U+00E5 : LATIN SMALL LETTER A WITH RING ABOVE
E6 = U+00E6 : LATIN SMALL LETTER AE
E7 = U+00E7 : LATIN SMALL LETTER C WITH CEDILLA
E8 = U+00E8 : LATIN SMALL LETTER E WITH GRAVE
E9 = U+00E9 : LATIN SMALL LETTER E WITH ACUTE
EA = U+00EA : LATIN SMALL LETTER E WITH CIRCUMFLEX
EB = U+00EB : LATIN SMALL LETTER E WITH DIAERESIS
EC = U+00EC : LATIN SMALL LETTER I WITH GRAVE
ED = U+00ED : LATIN SMALL LETTER I WITH ACUTE
EE = U+00EE : LATIN SMALL LETTER I WITH CIRCUMFLEX
EF = U+00EF : LATIN SMALL LETTER I WITH DIAERESIS
F0 = U+00F0 : LATIN SMALL LETTER ETH
F1 = U+00F1 : LATIN SMALL LETTER N WITH TILDE
F2 = U+00F2 : LATIN SMALL LETTER O WITH GRAVE
F3 = U+00F3 : LATIN SMALL LETTER O WITH ACUTE
F4 = U+00F4 : LATIN SMALL LETTER O WITH CIRCUMFLEX
F5 = U+00F5 : LATIN SMALL LETTER O WITH TILDE
F6 = U+00F6 : LATIN SMALL LETTER O WITH DIAERESIS
F7 = U+00F7 : DIVISION SIGN
F8 = U+00F8 : LATIN SMALL LETTER O WITH STROKE
F9 = U+00F9 : LATIN SMALL LETTER U WITH GRAVE
FA = U+00FA : LATIN SMALL LETTER U WITH ACUTE
FB = U+00FB : LATIN SMALL LETTER U WITH CIRCUMFLEX
FC = U+00FC : LATIN SMALL LETTER U WITH DIAERESIS
FD = U+00FD : LATIN SMALL LETTER Y WITH ACUTE
FE = U+00FE : LATIN SMALL LETTER THORN
FF = U+00FF : LATIN SMALL LETTER Y WITH DIAERESIS

jschulz@opal:~/programme/postgresql/pgmaps> cat utf8_to_win1252.map
 {0xc2a0, 0x00a0},
 {0xc2a1, 0x00a1},
 {0xc2a2, 0x00a2},
 {0xc2a3, 0x00a3},
 {0xc2a4, 0x00a4},
 {0xc2a5, 0x00a5},
 {0xc2a6, 0x00a6},
 {0xc2a7, 0x00a7},
 {0xc2a8, 0x00a8},
 {0xc2a9, 0x00a9},
 {0xc2aa, 0x00aa},
 {0xc2ab, 0x00ab},
 {0xc2ac, 0x00ac},
 {0xc2ad, 0x00ad},
 {0xc2ae, 0x00ae},
 {0xc2af, 0x00af},
 {0xc2b0, 0x00b0},
 {0xc2b1, 0x00b1},
 {0xc2b2, 0x00b2},
 {0xc2b3, 0x00b3},
 {0xc2b4, 0x00b4},
 {0xc2b5, 0x00b5},
 {0xc2b6, 0x00b6},
 {0xc2b7, 0x00b7},
 {0xc2b8, 0x00b8},
 {0xc2b9, 0x00b9},
 {0xc2ba, 0x00ba},
 {0xc2bb, 0x00bb},
 {0xc2bc, 0x00bc},
 {0xc2bd, 0x00bd},
 {0xc2be, 0x00be},
 {0xc2bf, 0x00bf},
 {0xc380, 0x00c0},
 {0xc381, 0x00c1},
 {0xc382, 0x00c2},
 {0xc383, 0x00c3},
 {0xc384, 0x00c4},
 {0xc385, 0x00c5},
 {0xc386, 0x00c6},
 {0xc387, 0x00c7},
 {0xc388, 0x00c8},
 {0xc389, 0x00c9},
 {0xc38a, 0x00ca},
 {0xc38b, 0x00cb},
 {0xc38c, 0x00cc},
 {0xc38d, 0x00cd},
 {0xc38e, 0x00ce},
 {0xc38f, 0x00cf},
 {0xc390, 0x00d0},
 {0xc391, 0x00d1},
 {0xc392, 0x00d2},
 {0xc393, 0x00d3},
 {0xc394, 0x00d4},
 {0xc395, 0x00d5},
 {0xc396, 0x00d6},
 {0xc397, 0x00d7},
 {0xc398, 0x00d8},
 {0xc399, 0x00d9},
 {0xc39a, 0x00da},
 {0xc39b, 0x00db},
 {0xc39c, 0x00dc},
 {0xc39d, 0x00dd},
 {0xc39e, 0x00de},
 {0xc39f, 0x00df},
 {0xc3a0, 0x00e0},
 {0xc3a1, 0x00e1},
 {0xc3a2, 0x00e2},
 {0xc3a3, 0x00e3},
 {0xc3a4, 0x00e4},
 {0xc3a5, 0x00e5},
 {0xc3a6, 0x00e6},
 {0xc3a7, 0x00e7},
 {0xc3a8, 0x00e8},
 {0xc3a9, 0x00e9},
 {0xc3aa, 0x00ea},
 {0xc3ab, 0x00eb},
 {0xc3ac, 0x00ec},
 {0xc3ad, 0x00ed},
 {0xc3ae, 0x00ee},
 {0xc3af, 0x00ef},
 {0xc3b0, 0x00f0},
 {0xc3b1, 0x00f1},
 {0xc3b2, 0x00f2},
 {0xc3b3, 0x00f3},
 {0xc3b4, 0x00f4},
 {0xc3b5, 0x00f5},
 {0xc3b6, 0x00f6},
 {0xc3b7, 0x00f7},
 {0xc3b8, 0x00f8},
 {0xc3b9, 0x00f9},
 {0xc3ba, 0x00fa},
 {0xc3bb, 0x00fb},
 {0xc3bc, 0x00fc},
 {0xc3bd, 0x00fd},
 {0xc3be, 0x00fe},
 {0xc3bf, 0x00ff},
 {0xc592, 0x008c},
 {0xc593, 0x009c},
 {0xc5a0, 0x008a},
 {0xc5a1, 0x009a},
 {0xc5b8, 0x009f},
 {0xc5bd, 0x008e},
 {0xc5be, 0x009e},
 {0xc692, 0x0083},
 {0xcb86, 0x0088},
 {0xcb9c, 0x0098},
 {0xe28093, 0x0096},
 {0xe28094, 0x0097},
 {0xe28098, 0x0091},
 {0xe28099, 0x0092},
 {0xe2809a, 0x0082},
 {0xe2809c, 0x0093},
 {0xe2809d, 0x0094},
 {0xe2809e, 0x0084},
 {0xe280a0, 0x0086},
 {0xe280a1, 0x0087},
 {0xe280a2, 0x0095},
 {0xe280a6, 0x0085},
 {0xe280b0, 0x0089},
 {0xe280b9, 0x008b},
 {0xe280ba, 0x009b},
 {0xe282ac, 0x0080},
 {0xe284a2, 0x0099},

jschulz@opal:~/programme/postgresql/pgmaps> cat win1252_to_utf8.map
 {0x0080, 0xe282ac},
 {0x0082, 0xe2809a},
 {0x0083, 0xc692},
 {0x0084, 0xe2809e},
 {0x0085, 0xe280a6},
 {0x0086, 0xe280a0},
 {0x0087, 0xe280a1},
 {0x0088, 0xcb86},
 {0x0089, 0xe280b0},
 {0x008a, 0xc5a0},
 {0x008b, 0xe280b9},
 {0x008c, 0xc592},
 {0x008e, 0xc5bd},
 {0x0091, 0xe28098},
 {0x0092, 0xe28099},
 {0x0093, 0xe2809c},
 {0x0094, 0xe2809d},
 {0x0095, 0xe280a2},
 {0x0096, 0xe28093},
 {0x0097, 0xe28094},
 {0x0098, 0xcb9c},
 {0x0099, 0xe284a2},
 {0x009a, 0xc5a1},
 {0x009b, 0xe280ba},
 {0x009c, 0xc593},
 {0x009e, 0xc5be},
 {0x009f, 0xc5b8},
 {0x00a0, 0xc2a0},
 {0x00a1, 0xc2a1},
 {0x00a2, 0xc2a2},
 {0x00a3, 0xc2a3},
 {0x00a4, 0xc2a4},
 {0x00a5, 0xc2a5},
 {0x00a6, 0xc2a6},
 {0x00a7, 0xc2a7},
 {0x00a8, 0xc2a8},
 {0x00a9, 0xc2a9},
 {0x00aa, 0xc2aa},
 {0x00ab, 0xc2ab},
 {0x00ac, 0xc2ac},
 {0x00ad, 0xc2ad},
 {0x00ae, 0xc2ae},
 {0x00af, 0xc2af},
 {0x00b0, 0xc2b0},
 {0x00b1, 0xc2b1},
 {0x00b2, 0xc2b2},
 {0x00b3, 0xc2b3},
 {0x00b4, 0xc2b4},
 {0x00b5, 0xc2b5},
 {0x00b6, 0xc2b6},
 {0x00b7, 0xc2b7},
 {0x00b8, 0xc2b8},
 {0x00b9, 0xc2b9},
 {0x00ba, 0xc2ba},
 {0x00bb, 0xc2bb},
 {0x00bc, 0xc2bc},
 {0x00bd, 0xc2bd},
 {0x00be, 0xc2be},
 {0x00bf, 0xc2bf},
 {0x00c0, 0xc380},
 {0x00c1, 0xc381},
 {0x00c2, 0xc382},
 {0x00c3, 0xc383},
 {0x00c4, 0xc384},
 {0x00c5, 0xc385},
 {0x00c6, 0xc386},
 {0x00c7, 0xc387},
 {0x00c8, 0xc388},
 {0x00c9, 0xc389},
 {0x00ca, 0xc38a},
 {0x00cb, 0xc38b},
 {0x00cc, 0xc38c},
 {0x00cd, 0xc38d},
 {0x00ce, 0xc38e},
 {0x00cf, 0xc38f},
 {0x00d0, 0xc390},
 {0x00d1, 0xc391},
 {0x00d2, 0xc392},
 {0x00d3, 0xc393},
 {0x00d4, 0xc394},
 {0x00d5, 0xc395},
 {0x00d6, 0xc396},
 {0x00d7, 0xc397},
 {0x00d8, 0xc398},
 {0x00d9, 0xc399},
 {0x00da, 0xc39a},
 {0x00db, 0xc39b},
 {0x00dc, 0xc39c},
 {0x00dd, 0xc39d},
 {0x00de, 0xc39e},
 {0x00df, 0xc39f},
 {0x00e0, 0xc3a0},
 {0x00e1, 0xc3a1},
 {0x00e2, 0xc3a2},
 {0x00e3, 0xc3a3},
 {0x00e4, 0xc3a4},
 {0x00e5, 0xc3a5},
 {0x00e6, 0xc3a6},
 {0x00e7, 0xc3a7},
 {0x00e8, 0xc3a8},
 {0x00e9, 0xc3a9},
 {0x00ea, 0xc3aa},
 {0x00eb, 0xc3ab},
 {0x00ec, 0xc3ac},
 {0x00ed, 0xc3ad},
 {0x00ee, 0xc3ae},
 {0x00ef, 0xc3af},
 {0x00f0, 0xc3b0},
 {0x00f1, 0xc3b1},
 {0x00f2, 0xc3b2},
 {0x00f3, 0xc3b3},
 {0x00f4, 0xc3b4},
 {0x00f5, 0xc3b5},
 {0x00f6, 0xc3b6},
 {0x00f7, 0xc3b7},
 {0x00f8, 0xc3b8},
 {0x00f9, 0xc3b9},
 {0x00fa, 0xc3ba},
 {0x00fb, 0xc3bb},
 {0x00fc, 0xc3bc},
 {0x00fd, 0xc3bd},
 {0x00fe, 0xc3be},
 {0x00ff, 0xc3bf},
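
The conversion step in codepage_to_utf8 depends on the external recode tool. As a minimal sketch, assuming only bash itself is available, the same kind of map entry can be produced with shell arithmetic. The function names ucs_to_utf8_hex and map_entry are hypothetical and not part of the scripts posted above, and the sketch only handles code points up to U+FFFF, which is enough for every entry in the win1252 file.

```shell
#!/bin/bash
# Sketch only: a recode-free variant of the conversion step.
# ucs_to_utf8_hex and map_entry are hypothetical names, not from the post.

# Encode one Unicode code point (4 hex digits, U+0000..U+FFFF) as UTF-8
# and print the resulting bytes as a single lowercase hex literal.
ucs_to_utf8_hex() {
    local cp=$((16#$1))
    if (( cp < 0x80 )); then
        # 1 byte: 0xxxxxxx
        printf '0x%02x\n' "$cp"
    elif (( cp < 0x800 )); then
        # 2 bytes: 110xxxxx 10xxxxxx
        printf '0x%02x%02x\n' \
            $(( 0xC0 | (cp >> 6) )) \
            $(( 0x80 | (cp & 0x3F) ))
    else
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        printf '0x%02x%02x%02x\n' \
            $(( 0xE0 | (cp >> 12) )) \
            $(( 0x80 | ((cp >> 6) & 0x3F) )) \
            $(( 0x80 | (cp & 0x3F) ))
    fi
}

# Turn one reference line such as "80 = U+20AC : EURO SIGN" into a
# codepage-to-UTF-8 map entry, using the same column positions as the
# cut calls in codepage_to_utf8.
map_entry() {
    local line=$1
    local byte=${line:0:2}   # columns 1-2: codepage byte in hex
    local ucs=${line:7:4}    # columns 8-11: Unicode code point in hex
    local utf8
    utf8=$(ucs_to_utf8_hex "$ucs")
    printf '  {0x00%s, %s},\n' "$byte" "$utf8" | tr 'A-Z' 'a-z'
}

map_entry "80 = U+20AC : EURO SIGN"                         # prints "  {0x0080, 0xe282ac},"
map_entry "E9 = U+00E9 : LATIN SMALL LETTER E WITH ACUTE"   # prints "  {0x00e9, 0xc3a9},"
```

Both printed entries agree with the corresponding lines of win1252_to_utf8.map above. Swapping the two fields in the printf format and piping the result through sort would give the utf8_to_codepage direction.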