Thread: Problem of a server gettext message.
Hi.

I think there are several problems here. However, with the release approaching, this is not something I can look at leisurely.

Server messages have a problem in 8.3beta4 on Windows. The situation is this:

1. Run initdb -E UTF-8 --no-locale. This gives the C locale.
   http://winpg.jp/~saito/pg83/postgresql-8.3beta4_info2.png
2. Install the Japanese messages from the po file (share/locale/ja).
3. Set client_encoding to SJIS.
   http://winpg.jp/~saito/pg83/postgresql-8.3beta4_info1.png
4. Do something that makes the server send an error message. It crashes.
   http://winpg.jp/~saito/pg83/postgresql-8.3beta4_crash.png
5. The reason is that the message the server outputs is in SJIS.
   http://winpg.jp/~saito/pg83/postgresql-8.3beta4_crash.log

Version 8.2.x outputs an English message, so the problem did not show up there.
So I am considering LC_MESSAGE for server messages, or would like a back-patch. Is there a good solution?

Regards,
Hiroshi Saito
On Monday, 10 December 2007, Hiroshi Saito wrote:
> 2. Japanese local message of po file to setting(share/locale/ja) .

Could we see the contents of this file?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Hi Peter-san.

Here it is:
http://winpg.jp/~saito/pg83/ja.zip

Regards,
Hiroshi Saito

----- Original Message -----
From: "Peter Eisentraut" <peter_e@gmx.net>

> On Monday, 10 December 2007, Hiroshi Saito wrote:
>> 2. Japanese local message of po file to setting(share/locale/ja) .
>
> Could we see the contents of this file?
>
> --
> Peter Eisentraut
> http://developer.postgresql.org/~petere/
On Monday, 10 December 2007, Hiroshi Saito wrote:
> Hi Peter-san.
>
> It is this.
> http://winpg.jp/~saito/pg83/ja.zip

Sorry, we need the *po* (text) files, not the *mo* (binary) files.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Hi.

From: "Peter Eisentraut" <peter_e@gmx.net>

> On Monday, 10 December 2007, Hiroshi Saito wrote:
>> Hi Peter-san.
>>
>> It is this.
>> http://winpg.jp/~saito/pg83/ja.zip
>
> Sorry, we need the *po* (text) files, not the *mo* (binary) files.

Oops. Here they are, although these are the files for version 8.2.5:
http://www.postgresql.jp/wg/jpugdoc/po/postgresql-8-2-5-nls-patch.gz
On Monday, 10 December 2007, Hiroshi Saito wrote:
> Hi.
>
> From: "Peter Eisentraut" <peter_e@gmx.net>
>
> > On Monday, 10 December 2007, Hiroshi Saito wrote:
> >> Hi Peter-san.
> >>
> >> It is this.
> >> http://winpg.jp/~saito/pg83/ja.zip
> >
> > Sorry, we need the *po* (text) files, not the *mo* (binary) files.
>
> Ooops, Although it is an object for Version 8.2.5.
> http://www.postgresql.jp/wg/jpugdoc/po/postgresql-8-2-5-nls-patch.gz

OK, you have

PO file in        EUC-JP
server encoding   UTF-8
client encoding   SJIS

When the server wants to send an error message to the client, it will convert them from the server to the client encoding. The English messages are ASCII, so this will work, because server encodings are required to be ASCII compatible. The result of the gettext calls, however, is encoded in EUC-JP, so the server will take the EUC-JP bytes and attempt to do a UTF-8 to SJIS conversion on them. This will cause a crash.

What you need to do is set the locale to something compatible with the server encoding (e.g., ja_JP.utf8). Then gettext will recode its EUC-JP data to UTF-8 before it is sent to the server. More specifically, you need to set the LC_CTYPE locale category to make this happen. I understand that users in Japanese environments like to keep the LC_COLLATE setting to C, and you should still be able to do that. But without a proper LC_CTYPE setting, this will not work.

(That is the explanation for Linux. Windows might be different in the details, but I suspect it has the same mechanisms.)

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
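[Editor's illustration] The mismatch described above can be reproduced outside the server. This is only a minimal Python sketch, not PostgreSQL code: the sample message text and the helper name server_to_client are assumptions made for illustration, and Python's codecs stand in for the server's conversion routines. Where the thread reports a crash, the sketch simply gets a decoding error.

```python
# Illustrative sketch only: Python codecs stand in for the server's
# encoding conversion. The sample message and helper name are assumptions.


def server_to_client(raw: bytes, server_enc: str, client_enc: str) -> bytes:
    """Convert bytes assumed to be in server_enc into client_enc."""
    return raw.decode(server_enc).encode(client_enc)


# Katakana for "error", written with escapes so the file stays ASCII.
message = "\u30a8\u30e9\u30fc"

# Peter's hypothesis: gettext hands back the catalog bytes unchanged
# (EUC-JP) because the C locale gives it no target encoding.
from_catalog = message.encode("euc_jp")

# The server treats its strings as UTF-8, so the conversion fails.
try:
    server_to_client(from_catalog, "utf-8", "shift_jis")
    mismatch_failed = False
except UnicodeDecodeError:
    mismatch_failed = True

# If the catalog text is first recoded to the server encoding, the same
# server-to-client conversion goes through.
recoded = from_catalog.decode("euc_jp").encode("utf-8")
to_client = server_to_client(recoded, "utf-8", "shift_jis")

print(mismatch_failed)
print(to_client == message.encode("shift_jis"))
```

The second half corresponds to what a matching LC_CTYPE setting is expected to achieve: the bytes reaching the server are already in the server encoding.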
Peter Eisentraut <peter_e@gmx.net> writes:
> When the server wants to send an error message to the client, it will
> convert them from the server to the client encoding. The English
> messages are ASCII, so this will work, because server encodings are
> required to be ASCII compatible. The result of the gettext calls,
> however, is encoded in EUC-JP, so the server will take the EUC-JP
> bytes and attempt to do a UTF-8 to SJIS conversion on them. This will
> cause a crash.

The problem here basically comes from the fact that gettext looks to LC_CTYPE to decide what encoding it's supposed to convert to (and I suppose it punts when LC_CTYPE = C). Does it have a way by which we could override that, to tell it the actual DB encoding regardless of the locale environment?

regards, tom lane
Hi Peter-san.

Thank you for everything!

----- Original Message -----
From: "Peter Eisentraut" <peter_e@gmx.net>

> On Monday, 10 December 2007, Hiroshi Saito wrote:
>> Hi.
>>
>> From: "Peter Eisentraut" <peter_e@gmx.net>
>>
>> > On Monday, 10 December 2007, Hiroshi Saito wrote:
>> >> Hi Peter-san.
>> >>
>> >> It is this.
>> >> http://winpg.jp/~saito/pg83/ja.zip
>> >
>> > Sorry, we need the *po* (text) files, not the *mo* (binary) files.
>>
>> Ooops, Although it is an object for Version 8.2.5.
>> http://www.postgresql.jp/wg/jpugdoc/po/postgresql-8-2-5-nls-patch.gz
>
> OK, you have
>
> PO file in EUC-JP
> server encoding UTF-8
> client encoding SJIS

Yes.

> When the server wants to send an error message to the client, it will convert
> them from the server to the client encoding. The English messages are ASCII,
> so this will work, because server encodings are required to be ASCII
> compatible. The result of the gettext calls, however, is encoded in EUC-JP,
> so the server will take the EUC-JP bytes and attempt to do a UTF-8 to SJIS
> conversion on them. This will cause a crash.

Probably not. gettext converts the po text (EUC_JP) to SJIS. Because of that, the server's stderr output is written to the log without error, and the messages there are correct, just like the ones at start-up.
However, a conversion failure occurs when the message is returned to a client. The conversion takes place in the following steps:

1. iconv (gettext): po (EUC_JP) to SJIS.
2. Message to client: UTF8 (server encoding) to SJIS (client encoding).

But the text that should be UTF-8 in step 2 is already an SJIS message (from step 1). That causes the error. This log shows it:
http://winpg.jp/~saito/pg83/postgresql-8.3beta4_crash.log
Anyway, that is the current situation, although it is a problem.

> What you need to do is set the locale to something compatible with the server
> encoding (e.g., ja_JP.utf8). Then gettext will recode its EUC-JP data to
> UTF-8 before it is sent to the server. More specifically, you need to set
> the LC_CTYPE locale category to make this happen. I understand that users in
> Japanese environments like to keep the LC_COLLATE setting to C, and you
> should still be able to do that. But without a proper LC_CTYPE setting, this
> will not work.
>
> (That is the explanation for Linux. Windows might be different in the
> details, but I suspect it has the same mechanisms.)

For messages, I think the current behavior is probably not like that.
This problem arises only on a server used with a client encoding that cannot be used as a server encoding. It may be a problem specific to Japan...
If the server does not use translated message text, the problem does not occur. So I will leave this as a TODO until I have time to spare. Sorry, I am very busy now.

I am deeply grateful to you for your kindness.

Regards,
Hiroshi Saito

"Hiroshi Saito" <z-saito@guitar.ocn.ne.jp> writes:
> Probably no.
> GetText is conversion po(EUC_JP) to SJIS. Then, The stderr output of a server is
> outputted without an error to log by it. That's right message with it similar to start-up.
> However, The conversion obstacle of a message is encountered at the time of the
> conditions returned to a client. Conversion of the step of the following it takes place.
> 1. iconv(GetText)
> po(EUC_JP) to SJIS.
> 2. message to client
> UTF8(server encoding) to SJIS(client encoding)
> But, this character that should be UTF-8 is a SJIS message(1.).
> It causes an error.

Are you sure about that? Why would gettext be converting to SJIS, when SJIS is nowhere in the environment it can see? I believe that Peter's hypothesis is that gettext is leaving the string in EUC_JP because it sees locale = C and so has no basis for doing any conversion.

We still end up with a failure, because the basic problem is that the string isn't UTF8, but it's important to be sure we understand the exact mechanism.

regards, tom lane
From: "Tom Lane" <tgl@sss.pgh.pa.us>

> Are you sure about that? Why would gettext be converting to SJIS, when
> SJIS is nowhere in the environment it can see? I believe that Peter's
> hypothesis is that gettext is leaving the string in EUC_JP because
> it sees locale = C and so has no basis for doing any conversion.
>
> We still end up with a failure, because the basic problem is that the
> string isn't UTF8, but it's important to be sure we understand the exact
> mechanism.

Um, here is a simple gettext test program:
http://winpg.jp/~saito/pg83/message_check/gtext.c

For example:
http://winpg.jp/~saito/pg83/message_check/gettext_932.png
http://winpg.jp/~saito/pg83/message_check/C_message.txt
http://winpg.jp/~saito/pg83/message_check/Non_message.txt
http://winpg.jp/~saito/pg83/message_check/UTF8_message.txt
http://winpg.jp/~saito/pg83/message_check/Japanese_message.txt
All of these are SJIS output.

However, after chcp 1252:
http://winpg.jp/~saito/pg83/message_check/gettext_1252.png

Regards,
Hiroshi Saito

> > GetText is conversion po(EUC_JP) to SJIS.

Yes.

> Are you sure about that? Why would gettext be converting to SJIS, when
> SJIS is nowhere in the environment it can see?

gettext uses GetACP() on Windows, wherever that gets its info from ...
"chcp" did change the GetACP code page in Hiroshi's example, but chcp is not reflected in LC_*.

Seems we may want to use bind_textdomain_codeset.

Andreas
Hi.

Yes, the cause of the problem is as you suggest. This is just a check:
http://winpg.jp/~saito/pg83/message_check/gtext2.c

With it, the needed message can be obtained by the following operations.

gtext2 C UTF-8
http://winpg.jp/~saito/pg83/message_check/codeset_utf8_msg.txt
gtext2 C EUC_JP
http://winpg.jp/~saito/pg83/message_check/codeset_eucjp_msg.txt

However, I have not yet verified that the result is accurate. If it works for all server encodings, I would like to work on it. But it is not desirable for several encodings to be intermingled in the log messages... So there is still no good method, and a better solution is probably needed.

Thanks!

Regards,
Hiroshi Saito

----- Original Message -----
From: "Zeugswetter Andreas ADI SD" <Andreas.Zeugswetter@s-itsolutions.at>

> > GetText is conversion po(EUC_JP) to SJIS.

Yes.

> Are you sure about that? Why would gettext be converting to SJIS, when
> SJIS is nowhere in the environment it can see?

gettext is using GetACP () on Windows, wherever that gets it's info from ...
"chcp" did change the GetACP codepage in Hiroshi's example, but chcp does not reflect in LC_*

Seems we may want to use bind_textdomain_codeset.

Andreas
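[Editor's illustration] The effect of pinning the output codeset, as the gtext2 check does with bind_textdomain_codeset, can be pictured with a toy model. The Python sketch below does not call gettext at all: translate, send_to_client, and the one-entry catalog are hypothetical names made up for illustration. It assumes, as reported in this thread, that on Windows the default target comes from the active code page (932 here).

```python
# Toy model, not the gettext API: it only mimics how the output codeset
# is chosen. Every name below is hypothetical.
from typing import Optional

CATALOG_CHARSET = "euc_jp"  # charset of the ja po file, per the thread
CATALOG = {"error": "\u30a8\u30e9\u30fc".encode(CATALOG_CHARSET)}


def translate(msgid: str,
              environment_codeset: Optional[str],
              bound_codeset: Optional[str] = None) -> bytes:
    """Return the translation in the codeset the caller will receive.

    bound_codeset models an explicit bind_textdomain_codeset() request.
    environment_codeset models what gettext derives by itself (from the
    locale, or from GetACP() on Windows as reported in the thread).
    """
    raw = CATALOG[msgid]
    target = bound_codeset or environment_codeset
    if target is None:
        return raw  # nothing to go on: catalog bytes pass through as-is
    return raw.decode(CATALOG_CHARSET).encode(target)


def send_to_client(server_bytes: bytes) -> bytes:
    """Server step: UTF-8 (server encoding) to SJIS (client encoding)."""
    return server_bytes.decode("utf-8").encode("shift_jis")


# Windows case from the thread: the environment yields code page 932.
from_environment = translate("error", environment_codeset="cp932")

# Same environment, but the codeset is pinned to the server encoding.
pinned = translate("error", environment_codeset="cp932",
                   bound_codeset="utf-8")

try:
    send_to_client(from_environment)
    environment_ok = True
except UnicodeDecodeError:
    environment_ok = False

client_bytes = send_to_client(pinned)
print(environment_ok)
print(client_bytes == "\u30a8\u30e9\u30fc".encode("shift_jis"))
```

In this model the pinned codeset wins over whatever the environment reports, so the server always receives text in its own encoding. Whether that is acceptable for the log output is the concern raised above: the log would then carry text in the server encoding rather than in the console code page.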