Thread: Problem of a server gettext message.

Problem of a server gettext message.

From
"Hiroshi Saito"
Date:
Hi.

I think there are several problems here. However, because the release
is approaching, this is not something I can afford to look at leisurely.

Server messages have a problem in 8.3beta4 on Windows.

The situation is this.

1. initdb -E UTF-8 --no-locale
This results in the C locale.
http://winpg.jp/~saito/pg83/postgresql-8.3beta4_info2.png

2. Install the Japanese message catalog built from the po file (share/locale/ja).

3. Set client_encoding to SJIS.
http://winpg.jp/~saito/pg83/postgresql-8.3beta4_info1.png

4. Perform an action that makes the server send an error message.
It crashes.
http://winpg.jp/~saito/pg83/postgresql-8.3beta4_crash.png

5. The reason is that the message the server outputs is in SJIS.
http://winpg.jp/~saito/pg83/postgresql-8.3beta4_crash.log

Version 8.2.x outputs English messages, so the problem did not show up there.
So I am considering LC_MESSAGES for the server messages, or I would like a back-patch.

Is there a good solution?

Regards,
Hiroshi Saito


Re: Problem of a server gettext message.

From
Peter Eisentraut
Date:
On Monday, 10 December 2007, Hiroshi Saito wrote:
> 2. Install the Japanese message catalog built from the po file (share/locale/ja).

Could we see the contents of this file?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: Problem of a server gettext message.

From
"Hiroshi Saito"
Date:
Hi Peter-san.

It is this.
http://winpg.jp/~saito/pg83/ja.zip

Regards,
Hiroshi Saito



Re: Problem of a server gettext message.

From
Peter Eisentraut
Date:
On Monday, 10 December 2007, Hiroshi Saito wrote:
> Hi Peter-san.
>
> It is this.
> http://winpg.jp/~saito/pg83/ja.zip

Sorry, we need the *po* (text) files, not the *mo* (binary) files.
-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: Problem of a server gettext message.

From
"Hiroshi Saito"
Date:
Hi.

From: "Peter Eisentraut" <peter_e@gmx.net>


> On Monday, 10 December 2007, Hiroshi Saito wrote:
>> Hi Peter-san.
>>
>> It is this.
>> http://winpg.jp/~saito/pg83/ja.zip
> 
> Sorry, we need the *po* (text) files, not the *mo* (binary) files.

Oops. Here they are, although these are for version 8.2.5.
http://www.postgresql.jp/wg/jpugdoc/po/postgresql-8-2-5-nls-patch.gz


Re: Problem of a server gettext message.

From
Peter Eisentraut
Date:
On Monday, 10 December 2007, Hiroshi Saito wrote:
> Hi.
>
> From: "Peter Eisentraut" <peter_e@gmx.net>
>
> > On Monday, 10 December 2007, Hiroshi Saito wrote:
> >> Hi Peter-san.
> >>
> >> It is this.
> >> http://winpg.jp/~saito/pg83/ja.zip
> >
> > Sorry, we need the *po* (text) files, not the *mo* (binary) files.
>
> Oops. Here they are, although these are for version 8.2.5.
> http://www.postgresql.jp/wg/jpugdoc/po/postgresql-8-2-5-nls-patch.gz

OK, you have

PO file in EUC-JP
server encoding UTF-8
client encoding SJIS

When the server wants to send an error message to the client, it will convert 
them from the server to the client encoding.  The English messages are ASCII, 
so this will work, because server encodings are required to be ASCII 
compatible.  The result of the gettext calls, however, is encoded in EUC-JP, 
so the server will take the EUC-JP bytes and attempt to do a UTF-8 to SJIS 
conversion on them.  This will cause a crash.

What you need to do is set the locale to something compatible with the server 
encoding (e.g., ja_JP.utf8).  Then gettext will recode its EUC-JP data to 
UTF-8 before it is sent to the server.  More specifically, you need to set 
the LC_CTYPE locale category to make this happen.  I understand that users in 
Japanese environments like to keep the LC_COLLATE setting to C, and you 
should still be able to do that.  But without a proper LC_CTYPE setting, this 
will not work.

(That is the explanation for Linux.  Windows might be different in the 
details, but I suspect it has the same mechanisms.)
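
A minimal sketch of the mechanism described above, written in Python purely
for illustration. It is not PostgreSQL or gettext code: gettext_output and
send_to_client are hypothetical helpers, the sample word is made up, and the
"no conversion when the locale is C" behaviour is the hypothesis from this
message, not something verified on Windows.

```python
SERVER_ENCODING = "utf-8"      # database encoding from the report
CLIENT_ENCODING = "shift_jis"  # client_encoding from the report


def gettext_output(catalog_bytes, catalog_encoding, locale_codeset):
    """Model of gettext: recode catalog text to the locale's codeset.

    locale_codeset=None stands for LC_CTYPE=C, where (by assumption)
    the catalog bytes are handed back unconverted.
    """
    if locale_codeset is None:
        return catalog_bytes
    return catalog_bytes.decode(catalog_encoding).encode(locale_codeset)


def send_to_client(message_bytes):
    """Model of the server-to-client conversion step."""
    return message_bytes.decode(SERVER_ENCODING).encode(CLIENT_ENCODING)


# Sample translated text in the PO file's encoding (hypothetical entry).
catalog_entry = "エラー".encode("euc_jp")

# English messages are ASCII, which is valid in every server encoding.
english_ok = send_to_client(b"ERROR: syntax error")

# LC_CTYPE=C: EUC-JP bytes are treated as UTF-8 and the conversion fails.
try:
    send_to_client(gettext_output(catalog_entry, "euc_jp", None))
    c_locale_failed = False
except UnicodeDecodeError:
    c_locale_failed = True

# LC_CTYPE=ja_JP.utf8: gettext recodes to UTF-8 first, so the conversion works.
utf8_locale_result = send_to_client(
    gettext_output(catalog_entry, "euc_jp", "utf-8")
)
```

In this model only the second Japanese case reaches the client intact, which
is why the LC_CTYPE setting matters even when LC_COLLATE stays at C.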

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: Problem of a server gettext message.

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> When the server wants to send an error message to the client, it will
> convert them from the server to the client encoding.  The English
> messages are ASCII, so this will work, because server encodings are
> required to be ASCII compatible.  The result of the gettext calls,
> however, is encoded in EUC-JP, so the server will take the EUC-JP
> bytes and attempt to do a UTF-8 to SJIS conversion on them.  This will
> cause a crash.

The problem here basically comes from the fact that gettext looks to
LC_CTYPE to decide what encoding it's supposed to convert to (and I
suppose it punts when LC_CTYPE = C).  Does it have a way by which we
could override that, to tell it the actual DB encoding regardless
of the locale environment?
        regards, tom lane


Re: Problem of a server gettext message.

From
"Hiroshi Saito"
Date:
Hi Peter-san.

Thank you for everything!

----- Original Message ----- 
From: "Peter Eisentraut" <peter_e@gmx.net>


> On Monday, 10 December 2007, Hiroshi Saito wrote:
>> Hi.
>>
>> From: "Peter Eisentraut" <peter_e@gmx.net>
>>
>> > On Monday, 10 December 2007, Hiroshi Saito wrote:
>> >> Hi Peter-san.
>> >>
>> >> It is this.
>> >> http://winpg.jp/~saito/pg83/ja.zip
>> >
>> > Sorry, we need the *po* (text) files, not the *mo* (binary) files.
>>
>> Oops. Here they are, although these are for version 8.2.5.
>> http://www.postgresql.jp/wg/jpugdoc/po/postgresql-8-2-5-nls-patch.gz
> 
> OK, you have
> 
> PO file in EUC-JP
> server encoding UTF-8
> client encoding SJIS

Yes.

> 
> When the server wants to send an error message to the client, it will convert 
> them from the server to the client encoding.  The English messages are ASCII, 
> so this will work, because server encodings are required to be ASCII 
> compatible.  The result of the gettext calls, however, is encoded in EUC-JP, 
> so the server will take the EUC-JP bytes and attempt to do a UTF-8 to SJIS 
> conversion on them.  This will cause a crash.

Probably not.
gettext converts the po (EUC_JP) text to SJIS. The server's stderr output is then
written to the log without error, and that message is correct, just like the ones at start-up.
However, the conversion fails when the message is returned to the
client. The conversion takes place in the following steps.

1. iconv (gettext)
po (EUC_JP) to SJIS.
2. message to client
UTF8 (server encoding) to SJIS (client encoding)
But the text that should be UTF-8 here is the SJIS message from step 1.
That causes an error.
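
The two steps above can be sketched as follows. This is only a Python
illustration of the sequence as described in this message, not server code;
the sample word and the variable names are made up.

```python
# Sample entry in the po file's encoding (hypothetical text).
po_text = "エラー".encode("euc_jp")

# Step 1: gettext (via iconv) recodes the po text from EUC_JP to SJIS.
step1_bytes = po_text.decode("euc_jp").encode("shift_jis")

# Written to the log as-is, the bytes are readable when viewed as SJIS.
log_text = step1_bytes.decode("shift_jis")

# Step 2: the server converts from UTF8 (server encoding) to SJIS
# (client encoding), but its input is already SJIS, not UTF-8.
try:
    step1_bytes.decode("utf-8").encode("shift_jis")
    step2_error = None
except UnicodeDecodeError as exc:
    step2_error = exc
```

Here the log line is fine while step 2 raises a decoding error, matching the
behaviour reported above.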

This log is the evidence:
http://winpg.jp/~saito/pg83/postgresql-8.3beta4_crash.log
Anyway, that is the current situation, problematic as it is.

> 
> What you need to do is set the locale to something compatible with the server 
> encoding (e.g., ja_JP.utf8).  Then gettext will recode its EUC-JP data to 
> UTF-8 before it is sent to the server.  More specifically, you need to set 
> the LC_CTYPE locale category to make this happen.  I understand that users in 
> Japanese environments like to keep the LC_COLLATE setting to C, and you 
> should still be able to do that.  But without a proper LC_CTYPE setting, this 
> will not work.
> 
> (That is the explanation for Linux.  Windows might be different in the 
> details, but I suspect it has the same mechanisms.)

As for the messages, I do not think that is how it currently behaves... probably.
The problem arises only on a server whose client encoding cannot be
used as a server encoding. It may be a problem specific to Japan... If translated
message text is not used by the server, the problem does not occur. Therefore, it stays
on my TODO list until I have time to spare. Sorry... I'm very busy now...

I am deeply grateful to you for your kindness.

Regards,
Hiroshi Saito


Re: Problem of a server gettext message.

From
Tom Lane
Date:
"Hiroshi Saito" <z-saito@guitar.ocn.ne.jp> writes:
> Probably not.
> gettext converts the po (EUC_JP) text to SJIS. The server's stderr output is then
> written to the log without error, and that message is correct, just like the ones at start-up.
> However, the conversion fails when the message is returned to the
> client. The conversion takes place in the following steps.

> 1. iconv (gettext)
> po (EUC_JP) to SJIS.
> 2. message to client
> UTF8 (server encoding) to SJIS (client encoding)
> But the text that should be UTF-8 here is the SJIS message from step 1.
> That causes an error.

Are you sure about that?  Why would gettext be converting to SJIS, when
SJIS is nowhere in the environment it can see?  I believe that Peter's
hypothesis is that gettext is leaving the string in EUC_JP because
it sees locale = C and so has no basis for doing any conversion.

We still end up with a failure, because the basic problem is that the
string isn't UTF8, but it's important to be sure we understand the exact
mechanism.
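
One way to pin down the mechanism is to compare the raw bytes in the crash
log with the same text encoded in each candidate encoding. A small Python
sketch, assuming a made-up sample word rather than an actual message from
the catalog:

```python
# Hypothetical sample; substitute text that actually appears in the log.
sample = "エラー"

# Hex dump of the same text in each encoding under discussion.
fingerprints = {
    name: sample.encode(name).hex(" ")
    for name in ("utf-8", "euc_jp", "shift_jis")
}

for name, hexdump in fingerprints.items():
    print(f"{name:10} {hexdump}")
```

Whichever row matches the bytes in the log indicates what gettext actually
returned: EUC_JP would support the "no conversion" hypothesis, SJIS the
"converted to SJIS" one.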
        regards, tom lane


Re: Problem of a server gettext message.

From
"Hiroshi Saito"
Date:
From: "Tom Lane" <tgl@sss.pgh.pa.us>

> Are you sure about that?  Why would gettext be converting to SJIS, when
> SJIS is nowhere in the environment it can see?  I believe that Peter's
> hypothesis is that gettext is leaving the string in EUC_JP because
> it sees locale = C and so has no basis for doing any conversion.
> 
> We still end up with a failure, because the basic problem is that the
> string isn't UTF8, but it's important to be sure we understand the exact
> mechanism.

Um, here is a simple gettext program.
http://winpg.jp/~saito/pg83/message_check/gtext.c

For example:
http://winpg.jp/~saito/pg83/message_check/gettext_932.png
http://winpg.jp/~saito/pg83/message_check/C_message.txt
http://winpg.jp/~saito/pg83/message_check/Non_message.txt
http://winpg.jp/~saito/pg83/message_check/UTF8_message.txt
http://winpg.jp/~saito/pg83/message_check/Japanese_message.txt
All of these outputs are in SJIS.

However, after chcp 1252:
http://winpg.jp/~saito/pg83/message_check/gettext_1252.png

Regards,
Hiroshi Saito



Re: Problem of a server gettext message.

From
"Zeugswetter Andreas ADI SD"
Date:
> > gettext converts the po (EUC_JP) text to SJIS.

Yes.

> Are you sure about that?  Why would gettext be converting to SJIS, when
> SJIS is nowhere in the environment it can see?

gettext uses GetACP() on Windows, wherever that gets its info from ...
"chcp" did change the GetACP codepage in Hiroshi's example, but chcp
is not reflected in LC_*.

Seems we may want to use bind_textdomain_codeset.
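
To illustrate the idea behind that suggestion: a message catalog declares its
own charset, and the encoding handed to the caller is a separate choice,
which is what bind_textdomain_codeset selects in the C API. The Python sketch
below is only an analogy using Python's standard gettext module, which decodes
the catalog to text and leaves the output encoding to the caller. build_mo is
a hypothetical helper that assembles a tiny in-memory catalog, and the sample
message is made up.

```python
import gettext
import io
import struct


def build_mo(entries, charset):
    """Build a minimal little-endian GNU .mo image in memory (no hash table)."""
    header = f"Content-Type: text/plain; charset={charset}\n"
    pairs = [(b"", header.encode("ascii"))]  # empty msgid carries the metadata
    pairs += sorted(
        (k.encode(charset), v.encode(charset)) for k, v in entries.items()
    )
    n = len(pairs)
    orig_table_off = 28                       # seven 4-byte header fields
    trans_table_off = orig_table_off + 8 * n
    data_off = trans_table_off + 8 * n
    orig_table, trans_table, blob = b"", b"", b""
    for msgid, _ in pairs:
        orig_table += struct.pack("<II", len(msgid), data_off + len(blob))
        blob += msgid + b"\0"
    for _, msgstr in pairs:
        trans_table += struct.pack("<II", len(msgstr), data_off + len(blob))
        blob += msgstr + b"\0"
    head = struct.pack(
        "<7I", 0x950412DE, 0, n, orig_table_off, trans_table_off, 0, 0
    )
    return head + orig_table + trans_table + blob


# Catalog stored in EUC-JP, like the po file in this thread.
mo_image = build_mo({"syntax error": "構文エラー"}, "EUC-JP")
catalog = gettext.GNUTranslations(io.BytesIO(mo_image))

# The catalog charset is used only to decode; the caller picks the output.
translated = catalog.gettext("syntax error")
as_server_bytes = translated.encode("utf-8")  # database encoding
as_client_bytes = as_server_bytes.decode("utf-8").encode("shift_jis")
```

With the output codeset chosen to match the database encoding, the later
UTF-8 to SJIS conversion receives valid input regardless of the locale or
code page in effect.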

Andreas


Re: Problem of a server gettext message.

From
"Hiroshi Saito"
Date:
Hi.

Yes, your suggestion points at the part where the problem occurs.

This is only a check.
http://winpg.jp/~saito/pg83/message_check/gtext2.c
With it, the required message can be obtained by running the following.
gtext2 C UTF-8
http://winpg.jp/~saito/pg83/message_check/codeset_utf8_msg.txt
gtext2 C EUC_JP
http://winpg.jp/~saito/pg83/message_check/codeset_eucjp_msg.txt

However, I have not yet verified that it is accurate. If it works for all server
encodings, I would like to work on it. But it is not desirable for several
encodings to be mixed in the log messages... So there is still no good
method here, and a better solution is needed, probably.

Thanks!

Regards,
Hiroshi Saito
