Thread: iso-8859-15/16 to MULE
I've been looking a bit at the MULE encoding with respect to Latin 9 and 10. It seems that there is no support for the euro in it at all. For example, when I tried to use "recode", which does recognise iso-8859-15 and 16, to convert to MULE, whatever I do I obtain "EUR" for the euro sign, and OE, oe, s, S, z, Z, "Y for the other characters which are specific to 15, and it's even worse for 16.

Should we NOT allow conversion to MULE, or restrict the support, for example by pretending iso-8859-15 is iso-8859-1 (resp. 16 is 2) for conversion from/to MULE (i.e. use the 0x81 and 0x82 octets for these encodings) and be done with it? (And MENTION it in the docs ;) ).

Anyway, I don't see anybody wanting euro support while using MULE to store their strings... UTF-8 is much more important (and straightforward) to support in that case :)

What do you think?

Patrice.

--
Patrice Hédé
email: patrice hede à islande org
www : http://www.islande.org/
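To make the "pretend 15 is 1" proposal concrete, here is a minimal Python sketch, not PostgreSQL code. It assumes the MULE internal form of a single-byte charset is the leading octet named above (0x81 for ISO-8859-1, 0x82 for ISO-8859-2) followed by the original high-bit byte, with ASCII passed through unchanged. The function names `latin_to_mule` and `mule_to_latin` are hypothetical.

```python
# Leading octets mentioned in the message above.
LC_ISO8859_1 = 0x81
LC_ISO8859_2 = 0x82


def latin_to_mule(data: bytes, leading_byte: int) -> bytes:
    """Prefix every high-bit byte with the charset's leading octet."""
    out = bytearray()
    for b in data:
        if b & 0x80:
            out.append(leading_byte)
        out.append(b)
    return bytes(out)


def mule_to_latin(data: bytes, leading_byte: int) -> bytes:
    """Inverse of latin_to_mule for a single expected leading octet."""
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if b & 0x80:
            if b != leading_byte or i + 1 >= len(data):
                raise ValueError("unexpected or truncated MULE sequence")
            out.append(data[i + 1])
            i += 2
        else:
            out.append(b)
            i += 1
    return bytes(out)


# The euro sign is byte 0xA4 in ISO-8859-15.
euro_latin9 = "€".encode("iso8859_15")
as_mule = latin_to_mule(euro_latin9, LC_ISO8859_1)
round_trip = mule_to_latin(as_mule, LC_ISO8859_1)
```

Under this scheme the byte survives a round trip, but anything that interprets the MULE string as real ISO-8859-1 will see 0xA4 as the currency sign rather than the euro, which is why the limitation would need to be documented.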
> e.g. when I tried to use "recode", which does recognise iso-8859-15
> and 16, and convert to MULE, whatever I do, I obtain "EUR" for the
> euro sign, OE, oe, s, S, z, Z, "Y for the different characters which
> are specific to 15 for example, and that's even worse for 16.

Apparently MULE currently does not support anything beyond ISO 8859-10 at all.

> Should we NOT allow conversion to Mule, or restrict the support, for
> example by pretending iso-8859-15 is iso-8859-1 (resp. 16 is 2) for
> conversion from/to mule (i.e. use the 0x81 and 0x82 octet for these
> encodings) and be done with it ?? (and MENTION it in the docs ;) ).

I think that we could neglect MULE encoding support for anything beyond ISO 8859-10, at least until MULE "officially" supports them.

> Anyway, I don't see somebody wanting support for the euro using Mule
> to store its strings... UTF-8 is much more important (and
> straightforward) to support in that case :)
>
> What do you think ?

Well, the conversion to/from UTF-8 for ISO 8859-10 or later is pretty easy and should be supported, I think. Actually I have already generated mapping tables for these charsets. I will make patches against current and leave it to core to decide whether they should be included in 7.2 or not.
--
Tatsuo Ishii
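As an illustration of why table-driven conversion to UTF-8 is straightforward for these charsets, here is a small Python sketch. It is not the mapping tables from the patch: the table is derived from Python's built-in `iso8859_15` codec, and `build_table` and `to_utf8` are hypothetical names.

```python
def build_table(codec: str) -> dict:
    """Map each high-bit byte of a single-byte charset to its UTF-8 bytes."""
    table = {}
    for b in range(0x80, 0x100):
        try:
            table[b] = bytes([b]).decode(codec).encode("utf-8")
        except UnicodeDecodeError:
            # Byte is undefined in this charset; leave it out of the table.
            pass
    return table


def to_utf8(data: bytes, table: dict) -> bytes:
    """Convert single-byte text to UTF-8 with one table lookup per byte."""
    out = bytearray()
    for b in data:
        if b < 0x80:
            out.append(b)      # ASCII is identical in UTF-8
        else:
            out += table[b]    # KeyError for bytes the charset does not define
    return bytes(out)


LATIN9_TABLE = build_table("iso8859_15")

# The eight positions where ISO-8859-15 differs from ISO-8859-1.
LATIN9_ONLY = bytes([0xA4, 0xA6, 0xA8, 0xB4, 0xB8, 0xBC, 0xBD, 0xBE])
converted = to_utf8(LATIN9_ONLY, LATIN9_TABLE)
```

The reverse direction can use the inverted table, with the usual caveat that UTF-8 input may contain characters the target charset cannot represent.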
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> Well, the conversion to/from UTF-8 for ISO 8859-10 or later is pretty
> easy and should be supported, I think. Actually I already have
> generated mapping tables for these charsets. I will make patches
> against current and leave it for the core's decision, whether it
> should be included in 7.2 or not.

If you are comfortable with these patches then apply them. You know more about multibyte issues than any of the core committee...

regards, tom lane
> > Well, the conversion to/from UTF-8 for ISO 8859-10 or later is pretty
> > easy and should be supported, I think. Actually I already have
> > generated mapping tables for these charsets. I will make patches
> > against current and leave it for the core's decision, whether it
> > should be included in 7.2 or not.
>
> If you are comfortable with these patches then apply them. You know
> more about multibyte issues than any of the core committee...

Ok. I have committed changes to support ISO-8859-6 to 16.

1) The following ISO-8859 series encoding names are supported. Column 1 is the "official" name and column 2 is the "alias" name.

LATIN1          ISO-8859-1
LATIN2          ISO-8859-2
LATIN3          ISO-8859-3
LATIN4          ISO-8859-4
LATIN5          ISO-8859-9
ISO-8859-6
ISO-8859-7
ISO-8859-8
ISO-8859-10     LATIN6
ISO-8859-13     LATIN7
ISO-8859-14     LATIN8
ISO-8859-15     LATIN9
ISO-8859-16

These encodings all support conversions to/from UNICODE (UTF-8).

2) LATIN5 no longer means ISO-8859-5; it now means ISO-8859-9. This may affect backward compatibility for LATIN5 databases, especially for conversion between LATIN5 and UNICODE. If you have a LATIN5 database and have used the UNICODE conversion capability, PLEASE CHECK YOUR DATABASE.
--
Tatsuo Ishii
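To show what the LATIN5 change in point 2 means for stored bytes, here is a short Python sketch using Python's own `iso8859_5` and `iso8859_9` codecs as stand-ins for the old and new interpretation. It does not touch a database, and `differing_bytes` is a hypothetical helper.

```python
def differing_bytes(codec_a: str, codec_b: str) -> list:
    """List (byte, char_a, char_b) for high-bit bytes the two codecs decode differently."""
    diffs = []
    for b in range(0x80, 0x100):
        raw = bytes([b])
        a = raw.decode(codec_a, errors="replace")
        c = raw.decode(codec_b, errors="replace")
        if a != c:
            diffs.append((b, a, c))
    return diffs


# Old meaning of LATIN5 (ISO-8859-5, Cyrillic) versus the new one (ISO-8859-9, Turkish).
diffs = differing_bytes("iso8859_5", "iso8859_9")

# The same stored bytes, read under each interpretation.
stored = "Привет".encode("iso8859_5")
old_reading = stored.decode("iso8859_5")
new_reading = stored.decode("iso8859_9")
```

Text that was written as Cyrillic comes back as unrelated Latin characters under the new reading, so any data that went through the LATIN5/UNICODE conversion is worth spot-checking this way.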