Thread: Trouble with error message encoding
I have encoding problems with translated error messages (7.4beta1). When the database encoding is set to SQL_ASCII, all messages arrive at the client correctly, respecting CLIENT_ENCODING, but if I create the database WITH ENCODING='unicode' or WITH ENCODING='latin2', messages are displayed correctly only when CLIENT_ENCODING is the same as the database encoding. I checked, and it works this way in 7.3 as well. Is this a known problem, or am I doing something wrong? Regards!
Darko Prenosil writes:

> I have encoding problems with translated error messages (7.4beta1).
> When the database encoding is set to SQL_ASCII, all messages arrive at the
> client correctly, respecting CLIENT_ENCODING, but if I create the database
> WITH ENCODING='unicode' or WITH ENCODING='latin2', messages are displayed
> correctly only when CLIENT_ENCODING is the same as the database encoding.
> I checked, and it works this way in 7.3 as well. Is this a known problem,
> or am I doing something wrong?

In general, the server encoding is S, the client encoding is C, and the messages are stored (in the source, or in the PO files) in encoding M. When the server sends a message to the client, it takes a string that is actually in encoding M, assumes it is in encoding S, and tries to convert it to encoding C. So, yes, there is a problem, but it's not easy to fix.

--
Peter Eisentraut peter_e@gmx.net
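To make the S/C/M mismatch concrete, here is a toy model in C. It is only a sketch and not PostgreSQL code: the function names are made up for illustration, and the two "decoders" handle only ASCII plus the single byte 0xE8 (which is "č" in Latin-2 but "è" in Latin-1). The message bytes play the role of M = Latin-2, the client encoding C is UTF-8, and the two decoders stand for a server that does or does not guess M correctly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A decoder maps one byte of a single-byte encoding to a Unicode code point. */
typedef unsigned int (*decoder)(unsigned char);

/* Latin-1 bytes are numerically equal to their code points. */
unsigned int latin1_to_cp(unsigned char b)
{
    return b;
}

/* Toy Latin-2 decoder: only ASCII and 0xE8 (U+010D, "c with caron") are supported. */
unsigned int latin2_to_cp(unsigned char b)
{
    assert(b < 0x80 || b == 0xE8);
    return b == 0xE8 ? 0x010D : b;
}

/* Encode a code point below U+0800 as UTF-8; returns the number of bytes written. */
static size_t put_utf8(unsigned int cp, char *out)
{
    assert(cp < 0x800);
    if (cp < 0x80) {
        out[0] = (char) cp;
        return 1;
    }
    out[0] = (char) (0xC0 | (cp >> 6));
    out[1] = (char) (0x80 | (cp & 0x3F));
    return 2;
}

/*
 * Convert src to UTF-8, interpreting its bytes with the given decoder,
 * and report whether the result equals the expected UTF-8 byte string.
 */
int converts_to(const char *src, decoder decode, const char *expected)
{
    char buf[64];
    size_t n = 0;

    assert(2 * strlen(src) + 1 <= sizeof buf);
    for (; *src != '\0'; src++)
        n += put_utf8(decode((unsigned char) *src), buf + n);
    buf[n] = '\0';
    return strcmp(buf, expected) == 0;
}
```

With the Latin-2 bytes of "točka" ("to\xE8" "ka"), decoding as Latin-2 yields the UTF-8 bytes C4 8D for "č", while decoding the same bytes as Latin-1 yields C3 A8, which the client displays as "è". The conversion itself succeeds in both cases; the text is wrong only because the source encoding was guessed incorrectly.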
----- Original Message -----
From: "Peter Eisentraut" <peter_e@gmx.net>
To: "Darko Prenosil" <darko.prenosil@finteh.hr>
Cc: <pgsql-hackers@postgresql.org>
Sent: Wednesday, September 10, 2003 7:20 PM
Subject: Re: [HACKERS] Trouble with error message encoding

> Darko Prenosil writes:
>
> > I have encoding problems with translated error messages (7.4beta1).
> > When the database encoding is set to SQL_ASCII, all messages arrive at the
> > client correctly, respecting CLIENT_ENCODING, but if I create the database
> > WITH ENCODING='unicode' or WITH ENCODING='latin2', messages are displayed
> > correctly only when CLIENT_ENCODING is the same as the database encoding.
> > I checked, and it works this way in 7.3 as well. Is this a known problem,
> > or am I doing something wrong?
>
> In general, the server encoding is S, the client encoding is C, and the
> messages are stored (in the source, or in the PO files) in encoding M.
> When the server sends a message to the client, it tries to convert a
> string of encoding M, thinking it is in encoding S, to encoding C. So,
> yes, there is a problem, but it's not easy to fix.
>

I found a quick and, I believe, dirty solution for this problem, so I need an opinion from the hackers.

Here is the idea: it is hard to find out which encoding the .mo file uses, but we can force gettext to return a known encoding, for example utf8. After that we can always convert from unicode to the client encoding.

In /src/backend/main/main.c:

    #ifdef ENABLE_NLS
    bindtextdomain("postgres", LOCALEDIR);
    bind_textdomain_codeset("postgres", "utf8");
    textdomain("postgres");
    #endif

In /src/backend/utils/error/elog.c:

    #define EVALUATE_MESSAGE(targetfield, appendval) \
    { \
        char *fmtbuf; \
        StringInfoData buf; \
        /* Internationalize the error format string */ \
        fmt = gettext(fmt); \
        fmt = pg_server_to_client((unsigned char*)fmt, strlen(fmt)); \
        ....

Of course this works only for backend messages, but it was enough for testing.
I did a quick test on a database created with 'latin2' and got correctly encoded messages for the latin2, unicode and sql_ascii client encodings. I realize that this way each message is converted twice, once by gettext and once by pg_server_to_client, but after all we want as few error messages as possible :-) Sorry if the whole idea is stupid, but I could not resist. Regards!
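For readers who want to try the gettext half of this idea outside the backend, here is a minimal standalone sketch. It assumes a platform whose libintl.h provides bind_textdomain_codeset() (GNU gettext and glibc do); the function name lookup_utf8 and its arguments are placeholders and not part of PostgreSQL. It uses the spelling "UTF-8" for the codeset, where the post above uses "utf8".

```c
#include <assert.h>
#include <libintl.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/*
 * Look up msgid in the given text domain and ask gettext to hand the
 * translation back in UTF-8, whatever encoding the .mo file was written in.
 * domain and localedir are placeholders chosen by the caller.
 */
const char *lookup_utf8(const char *domain, const char *localedir,
                        const char *msgid)
{
    assert(domain != NULL && localedir != NULL && msgid != NULL);

    /* Take the message locale from the environment (LANG, LC_MESSAGES, ...). */
    setlocale(LC_MESSAGES, "");

    /* Tell gettext where the catalogs for this domain live. */
    bindtextdomain(domain, localedir);

    /* Request UTF-8 output for this domain, independent of the locale's codeset. */
    bind_textdomain_codeset(domain, "UTF-8");

    /* If no matching catalog is found, dgettext() returns msgid itself. */
    return dgettext(domain, msgid);
}
```

The caller then knows the source encoding of the returned string and can run a single, well-defined conversion from UTF-8 to the client encoding, which is the step the elog.c change above is aiming for.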
Darko Prenosil writes:

> Here is the idea: it is hard to find out which encoding the .mo file uses,
> but we can force gettext to return a known encoding, for example utf8.
> After that we can always convert from unicode to the client encoding.

Hmm, I've never heard of bind_textdomain_codeset(). How portable is it?

--
Peter Eisentraut peter_e@gmx.net
On Thursday 11 September 2003 20:13, Peter Eisentraut wrote:

> Darko Prenosil writes:
> > Here is the idea: it is hard to find out which encoding the .mo file uses,
> > but we can force gettext to return a known encoding, for example utf8.
> > After that we can always convert from unicode to the client encoding.
>
> Hmm, I've never heard of bind_textdomain_codeset(). How portable is it?

I sent a message yesterday, but it looks like it did not make it through.

See: http://www.gnu.org/manual/gettext/

According to that documentation, it is a standard part of GNU gettext. A few Gnome applications use it; I saw that on mailing lists. I also found it in the UNIX gettext package documentation. I do not know about other platforms.

Sorry if you already got the previous message, but I do not see it on the list.

P.S. I messed up that line in elog.c: pg_server_to_client assumes the string is in the server encoding, but the conversion should start from a utf8 source. You understand the idea, though.

Regards