Re: Encoding issues - Mailing list pgsql-hackers

From Karel Zak
Subject Re: Encoding issues
Date
Msg-id 20011010093908.A29004@zf.jcu.cz
In response to Encoding issues  (Tatsuo Ishii <t-ishii@sra.co.jp>)
List pgsql-hackers
On Wed, Oct 10, 2001 at 03:40:25PM +0900, Tatsuo Ishii wrote:
> Having received a request to add ISO 8859-15 and 16, I reviewed the
> multibyte support code and found several errors in it.
> 
> 1) There is a confusion between "LATIN5" and ISO 8859-5. LATIN5 is not
>    ISO 8859-5, but is actually ISO 8859-9. Should we rename LATIN5 to
>    "ISO8859-5" (or whatever) as the encoding name? I think we should.
>    For your information, here is the correct mapping between ISO
>    8859-n and LATINn.
> 
>    ISO 8859-1    LATIN1
>    ISO 8859-2    LATIN2
>    ISO 8859-3    LATIN3
>    ISO 8859-4    LATIN4
>    ISO 8859-9    LATIN5
>    ISO 8859-10    LATIN6
You are right. I have now looked at some old versions of PostgreSQL, and this
confusion is there in some headers and comments too.
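
The mapping quoted above can be written down as a small lookup table. This is only a sketch for illustration: the struct, the table and the function name `latin_name_for_iso_part` are hypothetical and are not taken from the PostgreSQL sources. It covers only the six pairs listed in the message.

```c
#include <stddef.h>

/* Hypothetical helper type: one ISO 8859 part number and its LATINn name. */
struct iso_latin_pair {
    int         iso_part;    /* the n in "ISO 8859-n" */
    const char *latin_name;  /* the matching LATINn name */
};

/* Only the pairs listed in the message above. */
const struct iso_latin_pair iso_latin_map[] = {
    {1,  "LATIN1"},
    {2,  "LATIN2"},
    {3,  "LATIN3"},
    {4,  "LATIN4"},
    {9,  "LATIN5"},  /* LATIN5 is ISO 8859-9, not ISO 8859-5 */
    {10, "LATIN6"},
};

/* Return the LATINn name for an ISO 8859 part, or NULL if it has none here. */
const char *latin_name_for_iso_part(int iso_part)
{
    size_t i;
    size_t n = sizeof(iso_latin_map) / sizeof(iso_latin_map[0]);

    for (i = 0; i < n; i++) {
        if (iso_latin_map[i].iso_part == iso_part)
            return iso_latin_map[i].latin_name;
    }
    return NULL;
}
```

With this table, looking up part 9 gives "LATIN5", while part 5 (the Cyrillic set) has no LATINn name and gives NULL, which is exactly the distinction that got confused.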
 
> 2) The leading characters for some Cyrillic charsets are wrong.
> 
> Currently they are defined as:
> 
> #define LC_KOI8_R    0x8c    /* Cyrillic KOI8-R */
> #define LC_KOI8_U    0x8c    /* Cyrillic KOI8-U */
> #define LC_ISO8859_5    0x8d    /* ISO8859 Cyrillic */
> 
> These should be:
> 
> #define LC_KOI8_R    0x8b    /* Cyrillic KOI8-R */
> #define LC_KOI8_U    0x8b    /* Cyrillic KOI8-U */
> #define LC_ISO8859_5    0x8c    /* ISO8859 Cyrillic */
Again, this has been in the sources for a long time too (it is interesting that we did not understand some bug report).
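
To illustrate why the leading-character values matter, here is a minimal sketch. It assumes the usual MULE layout for single-byte charsets, where a non-ASCII character is stored as the charset's leading byte followed by the character byte; that layout is an assumption here and is not stated in the message. The function name `mule_put_1byte` is hypothetical, and the two constants are the corrected values proposed above.

```c
#include <stddef.h>

/* Corrected leading characters as proposed in the quoted message. */
#define LC_KOI8_R    0x8b    /* Cyrillic KOI8-R */
#define LC_ISO8859_5 0x8c    /* ISO8859 Cyrillic */

/*
 * Hypothetical helper: write one character of a single-byte charset in
 * MULE internal form (assumed layout: leading byte + character byte).
 * ASCII bytes are copied unchanged. Returns the number of bytes written;
 * out must have room for 2 bytes.
 */
size_t mule_put_1byte(unsigned char lc, unsigned char c, unsigned char *out)
{
    if (c < 0x80) {
        out[0] = c;      /* plain ASCII needs no leading byte */
        return 1;
    }
    out[0] = lc;         /* charset tag */
    out[1] = c;          /* character byte in the original charset */
    return 2;
}
```

Under this assumption the leading byte ends up in every stored non-ASCII character, so data written with the old value 0x8c for KOI8-R would be read as ISO 8859-5 once the constants are corrected.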

>     Correcting them would affect users who are storing their data in
>     the database using MULE internal code. I think quite few people
>     are using MULE internal code. So we could correct them for 7.2.
> 
> Comments?
I agree with you; making a release with known bugs is a dirty thing.

> BTW, should we support ISO 8859-6 and beyond for 7.2? There have been
> some requests to do that. Supporting them is actually trivial work;
> it should be a one-day job. The harder part is writing conversion
> functions between encodings. However, there is very little demand for
> that, I guess. If so, we could omit the conversion capability for 7.2.
> Comments?
You will hear "we are in the feature freeze state.." :-)
   Karel

-- 
 Karel Zak  <zakkr@zf.jcu.cz>
 http://home.zf.jcu.cz/~zakkr/

 C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz

