Thread: Trouble with error message encoding

Trouble with error message encoding

From

Darko Prenosil

Date:

10 September 2003, 05:51:00

I have encoding problems using translated error messages (7.4beta1).
When the database encoding is SQL_ASCII, all messages arrive at the client
correctly, respecting CLIENT_ENCODING, but if I create a database WITH
ENCODING='unicode' or WITH ENCODING='latin2', messages are displayed
correctly only when CLIENT_ENCODING is the same as the database encoding.
I checked, and 7.3 behaves the same way. Is this a known problem, or
am I doing something wrong?

Regards !

Re: Trouble with error message encoding

From

Peter Eisentraut

Date:

10 September 2003, 14:22:50

Darko Prenosil writes:

> I have encoding problems using translated error messages (7.4beta1).
> When the database encoding is SQL_ASCII, all messages arrive at the client
> correctly, respecting CLIENT_ENCODING, but if I create a database WITH
> ENCODING='unicode' or WITH ENCODING='latin2', messages are displayed
> correctly only when CLIENT_ENCODING is the same as the database encoding.
> I checked, and 7.3 behaves the same way. Is this a known problem, or
> am I doing something wrong?

In general, the server encoding is S, the client encoding is C, and the
messages are stored (in the source, or in the PO files) in encoding M.
When the server sends a message to the client, it tries to convert a
string of encoding M, thinking it is in encoding S, to encoding C.  So,
yes, there is a problem, but it's not easy to fix.
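
The mislabeling described above can be reproduced outside the server. What follows is a minimal sketch, not PostgreSQL code: it assumes a POSIX system with iconv(), uses iconv() as a stand-in for the server's own conversion routines, and convert_message() is a hypothetical helper name. Take M = ISO-8859-2, S = UTF-8 and C = ISO-8859-2, with the example message "greška" stored in Latin-2.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/*
 * Convert the NUL-terminated string `in` from encoding `from` to encoding
 * `to`.  Returns the number of bytes written to `out` (which is then
 * NUL-terminated), or -1 if the conversion fails.
 * Hypothetical helper for illustration, not a PostgreSQL function.
 */
long
convert_message(const char *from, const char *to,
                const char *in, char *out, size_t outsize)
{
    iconv_t cd;
    char   *inp = (char *) in;      /* iconv() takes a non-const pointer */
    char   *outp = out;
    size_t  inleft = strlen(in);
    size_t  outleft;
    size_t  rc;

    assert(outsize > 0);
    outleft = outsize - 1;          /* keep room for the terminator */

    cd = iconv_open(to, from);      /* note the order: target first */
    if (cd == (iconv_t) -1)
        return -1;

    rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);
    if (rc == (size_t) -1)
        return -1;                  /* e.g. EILSEQ on invalid input */

    *outp = '\0';
    return (long) (outp - out);
}
```

Calling convert_message("UTF-8", "ISO-8859-2", "gre\xB9ka", ...) treats the Latin-2 bytes as if they were in the server encoding; the byte 0xB9 is not valid UTF-8 on its own, so with glibc's iconv the call fails. Labeling the source correctly, convert_message("ISO-8859-2", "UTF-8", "gre\xB9ka", ...), succeeds. When S and C are equal no conversion is attempted at all, which is why the message passes through intact in that case.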

-- 
Peter Eisentraut   peter_e@gmx.net

Re: Trouble with error message encoding

From

"Darko Prenosil"

Date:

10 September 2003, 18:55:57

----- Original Message -----
From: "Peter Eisentraut" <peter_e@gmx.net>
To: "Darko Prenosil" <darko.prenosil@finteh.hr>
Cc: <pgsql-hackers@postgresql.org>
Sent: Wednesday, September 10, 2003 7:20 PM
Subject: Re: [HACKERS] Trouble with error message encoding


> Darko Prenosil writes:
>
> > I have encoding problems using translated error messages (7.4beta1).
> > When the database encoding is SQL_ASCII, all messages arrive at the client
> > correctly, respecting CLIENT_ENCODING, but if I create a database WITH
> > ENCODING='unicode' or WITH ENCODING='latin2', messages are displayed
> > correctly only when CLIENT_ENCODING is the same as the database encoding.
> > I checked, and 7.3 behaves the same way. Is this a known problem, or
> > am I doing something wrong?
>
> In general, the server encoding is S, the client encoding is C, and the
> messages are stored (in the source, or in the PO files) in encoding M.
> When the server sends a message to the client, it tries to convert a
> string of encoding M, thinking it is in encoding S, to encoding C.  So,
> yes, there is a problem, but it's not easy to fix.
>
I found a quick and, I believe, dirty solution for this problem, so I need
opinions from the hackers.
Here is the idea: it is hard to find out which encoding the .mo file
uses, but we can force gettext to return a known encoding, for example UTF-8.
After that we can always convert from Unicode to the client encoding.

In /src/backend/main/main.c :

#ifdef ENABLE_NLS
	bindtextdomain("postgres", LOCALEDIR);
	bind_textdomain_codeset("postgres", "utf8");
	textdomain("postgres");
#endif

in /src/backend/utils/error/elog.c

#define EVALUATE_MESSAGE(targetfield, appendval)  \
{ \
	char	   *fmtbuf; \
	StringInfoData buf; \
	/* Internationalize the error format string */ \
	fmt = gettext(fmt); \
	fmt = pg_server_to_client((unsigned char *) fmt, strlen(fmt)); \
 
....

Of course this works only for backend messages, but that was enough for
testing.
I did a quick test on a database created with 'latin2', and I got correctly
encoded messages for the latin2, unicode and sql_ascii client encodings.
I realize that this way the message is translated twice, by gettext and by
pg_server_to_client, but after all we want as few error messages as
possible :-)
Sorry if the whole idea is stupid, but I could not resist.

Regards !

Re: Trouble with error message encoding

From

Peter Eisentraut

Date:

11 September 2003, 15:14:00

Darko Prenosil writes:

> Here is the idea: it is hard to find out which encoding the .mo file
> uses, but we can force gettext to return a known encoding, for example UTF-8.
> After that we can always convert from Unicode to the client encoding.

Hmm, I've never heard of bind_textdomain_codeset().  How portable is it?

-- 
Peter Eisentraut   peter_e@gmx.net

Re: Trouble with error message encoding

From

Darko Prenosil

Date:

12 September 2003, 05:52:35

On Thursday 11 September 2003 20:13, Peter Eisentraut wrote:
> Darko Prenosil writes:
> > Here is the idea: it is hard to find out which encoding the .mo file
> > uses, but we can force gettext to return a known encoding, for example
> > UTF-8. After that we can always convert from Unicode to the client encoding.
>
> Hmm, I've never heard of bind_textdomain_codeset().  How portable is it?

I sent a message yesterday, but it looks like it did not make it through.

See: http://www.gnu.org/manual/gettext/
According to that documentation, it is a standard part of GNU gettext.
A few GNOME applications use it; I saw that on mailing lists.
I also found it in the UNIX gettext package documentation.
I do not know about other platforms.
Sorry if you already got the previous message, but I do not see it on the list.

P.S. I messed up that line in elog.c, because the conversion should go from a
UTF-8 source, but you understand the idea.
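
The corrected idea (gettext always returns UTF-8, then one conversion from UTF-8 to the client encoding) could look roughly like the sketch below. This is not the actual patch: it assumes a libintl that provides bind_textdomain_codeset(), it uses iconv() in place of the backend's conversion routines, and use_utf8_catalog() and message_for_client() are hypothetical names.

```c
#include <assert.h>
#include <iconv.h>
#include <libintl.h>
#include <stddef.h>
#include <string.h>

/*
 * Ask gettext to return every message of `domain` in UTF-8, whatever
 * encoding the .mo file itself was written in.
 */
void
use_utf8_catalog(const char *domain, const char *localedir)
{
    bindtextdomain(domain, localedir);
    bind_textdomain_codeset(domain, "UTF-8");
    textdomain(domain);
}

/*
 * Look up `msgid` and convert the result from UTF-8 to `client_encoding`.
 * Returns the number of bytes written to `out` (NUL-terminated), or -1 if
 * the conversion fails.  If no translation is found, gettext() returns
 * msgid itself, so msgid is assumed to be ASCII or UTF-8.
 */
long
message_for_client(const char *msgid, const char *client_encoding,
                   char *out, size_t outsize)
{
    iconv_t cd;
    char   *inp = gettext(msgid);   /* UTF-8 thanks to use_utf8_catalog() */
    char   *outp = out;
    size_t  inleft = strlen(inp);
    size_t  outleft;
    size_t  rc;

    assert(outsize > 0);
    outleft = outsize - 1;          /* keep room for the terminator */

    cd = iconv_open(client_encoding, "UTF-8");
    if (cd == (iconv_t) -1)
        return -1;

    rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);
    if (rc == (size_t) -1)
        return -1;

    *outp = '\0';
    return (long) (outp - out);
}
```

The point of the design is that the source encoding of the conversion is always known, so neither the encoding of the .mo file nor the server encoding enters into it. With glibc both gettext and iconv are part of libc; on other platforms the program may need to be linked with -lintl and -liconv.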

Regards