Re: More message encoding woes - Mailing list pgsql-hackers

From Hiroshi Inoue
Subject Re: More message encoding woes
Date
Msg-id 49D4A989.8020907@tpf.co.jp
In response to Re: More message encoding woes  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: More message encoding woes  (Hiroshi Inoue <inoue@tpf.co.jp>)
Re: More message encoding woes  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
Heikki Linnakangas wrote:
> Tom Lane wrote:
>> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>>> Tom Lane wrote:
>>>> Maybe use a special string "Translate Me First" that
>>>> doesn't actually need to be end-user-visible, just so no one sweats over
>>>> getting it right in context.
>>
>>> Yep, something like that. There seems to be a magic empty string 
>>> translation at the beginning of every po file that returns the 
>>> meta-information about the translation, like translation author and 
>>> date. Assuming that works reliably, I'll use that.
>>
>> At first that sounded like an ideal answer, but I can see a gotcha:
>> suppose the translation's author's name contains some characters that
>> don't convert to the database encoding.  I suppose that would result in
>> failure, when we'd prefer it not to.  A single-purpose string could be
>> documented as "whatever you translate this to should be pure ASCII,
>> never mind if it's sensible".
> 
> I just tried that, and it seems that gettext() does transliteration, so 
> any characters that have no counterpart in the database encoding will be 
> replaced with something similar, or question marks. Assuming that's 
> universal across platforms, and I think it is, using the empty string 
> should work.
> 
> It also means that you can use lc_messages='ja' with 
> server_encoding='latin1', but it will be unreadable because all the 
> non-ascii characters are replaced with question marks. For something 
> like lc_messages='es_ES' and server_encoding='koi8-r', it will still 
> look quite nice.
> 
> Attached is a patch I've been testing. Seems to work quite well. It 
> would be nice if someone could test it on Windows, which seems to be a 
> bit special in this regard.

Unfortunately it doesn't seem to work on Windows.

First, any combination of a valid lc_messages and a non-existent encoding
passes the test  strcmp(gettext(""), "") != 0 .
Second, for example, the combination of ja (lc_messages) and ISO-8859-1
passes the test, but the test fails after I changed the last_translator
part of the ja message catalog to contain Japanese kanji characters.
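
For reference, the check being discussed can be sketched roughly as below. This is only an illustration of the empty-string test, not the code from the patch: the helper names header_is_nonempty and passes_empty_string_test and the domain name used in the usage note are hypothetical, and I am assuming the codeset is selected with bind_textdomain_codeset() before the lookup.

```c
#include <libintl.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: the lookup of "" counts as successful when it
 * returns something other than the empty msgid itself, i.e. the
 * catalog's header entry (translator, date, charset and so on).
 */
int
header_is_nonempty(const char *translated)
{
    return translated != NULL && strcmp(translated, "") != 0;
}

/*
 * Hypothetical wrapper: request output in the given codeset for one
 * text domain, then apply the empty-string test to that domain.
 * Returns 1 if the test passes, 0 otherwise.
 */
int
passes_empty_string_test(const char *domain, const char *codeset)
{
    if (bind_textdomain_codeset(domain, codeset) == NULL)
        return 0;               /* codeset could not be recorded */

    return header_is_nonempty(dgettext(domain, ""));
}
```

A caller would use something like passes_empty_string_test("some_domain", "ISO-8859-1") after setlocale(). Both observations above are about the return value of this kind of check: it can come back non-zero even when the encoding does not exist, and it can come back zero merely because the header contains characters the encoding cannot represent.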

regards,
Hiroshi Inoue

