Re: BUG #11550: Error messages contain not encodable characters (Latin9) - Mailing list pgsql-bugs

From Walter Willmertinger
Subject Re: BUG #11550: Error messages contain not encodable characters (Latin9)
Msg-id CAHbMG0V1Seo+GVw9rC8XRG18YqtiWx0HLiVRV-m2C_a1JmhHAg@mail.gmail.com
In response to Re: BUG #11550: Error messages contain not encodable characters (Latin9)  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-bugs
I think the easiest way would be to change the error messages in a future
release so that they contain only standard ANSI characters, i.e. only " and '
are used as delimiters.
If someone uses an umlaut in a field or table name, that is fine, because
they created the table or field name with their own client encoding.

In PG 8 versions we never had this problem, as no "bad" delimiting
characters were used in error messages. They were introduced in some PG 9
version, and since then we have had a lot of problems.

For example, if you have a complicated SQL script using Latin9
client_encoding and the only output you get is
    "character with byte sequence 0xe2 0x80 0x9e in encoding UTF8 has no
    equivalent in LATIN9",
you do not see the real error and have to edit the script to find it. More
problems arise if you have a Delphi application and get this error message.
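
To show which character is behind that byte sequence, here is a minimal
Python sketch. It only assumes that Python's "iso8859_15" codec corresponds
to what PostgreSQL calls LATIN9 (both are ISO 8859-15); it does not touch the
server at all.

```python
import unicodedata

# Byte sequence quoted in the server's error message.
raw = bytes([0xE2, 0x80, 0x9E])

# Decode as UTF-8 to see which character the server tried to send.
ch = raw.decode("utf-8")
print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# Try to encode that character as ISO 8859-15 (LATIN9).
try:
    ch.encode("iso8859_15")
    encodable = True
except UnicodeEncodeError:
    encodable = False
print("encodable in LATIN9:", encodable)
```

The character is the German opening quotation mark, which ISO 8859-15 does
not contain, so the conversion of the message itself fails.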

- Walter

2014-10-03 9:21 GMT+02:00 Heikki Linnakangas <hlinnakangas@vmware.com>:

> On 10/03/2014 03:15 AM, Bruce Momjian wrote:
>
>> On Wed, Oct  1, 2014 at 08:09:23PM +0000, willmis@gmail.com wrote:
>>
>>> If we set client_encoding to Latin9 (as we are here in Germany), we get an
>>> error message from PostgreSQL:
>>>
>>> character with byte sequence 0xe2 0x80 0x9e in encoding UTF8 has no
>>> equivalent in LATIN9
>>>
>>> This behaviour leads to problems in tools we use like Zeos etc.
>>>
>>> Is there a way to change the delimiters in messages?
>>>
>>
>> No.
>>
>
> Well, you could manually search & replace the .po files and run msgfmt on
> them. But no, there's no easy way out.
>
> Can we fix this? It's annoying that you can't use LATIN9 with a German
> locale, as that is the most common encoding used with German, aside from
> UTF-8.
>
> In general, it's annoying that you get errors like that if you use an
> encoding that can't represent all the characters in error messages. In
> situations like this, it would be clearly better to transliterate the
> quotation marks to " or «». Also with umlauts (äöü), it would be better to
> transliterate them to ae, oe, ue, than to throw an error.
>
> Gettext does perform transliteration, but the problem is that we first
> convert the error message to the server encoding, using gettext, and then
> convert from server encoding to the client encoding ourselves. It would
> make more sense to let gettext convert directly to the client encoding. We
> currently construct all the messages in server encoding and convert to
> client encoding just before sending to the client, so changing that would
> require some care to keep track which messages are already in client
> encoding and which ones need conversion. But if someone is willing to put
> some effort into it, it seems doable.
>
> - Heikki
>
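
The transliteration fallback discussed in the quoted message could look
roughly like the Python sketch below. It is an illustration only, not how
PostgreSQL or gettext implement it: the FALLBACK table and the function name
to_client_encoding are hypothetical, and "?" for unmapped characters is an
arbitrary choice.

```python
# Hypothetical fallback table: ASCII replacements for characters that the
# client encoding may not be able to represent.
FALLBACK = {
    "\u201e": '"',   # German opening quotation mark
    "\u201c": '"',   # German closing quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u00e4": "ae",  # a with diaeresis
    "\u00f6": "oe",  # o with diaeresis
    "\u00fc": "ue",  # u with diaeresis
}


def to_client_encoding(message: str, encoding: str) -> bytes:
    """Encode message, substituting only the characters that do not fit."""
    out = []
    for ch in message:
        try:
            out.append(ch.encode(encoding))
        except UnicodeEncodeError:
            # Character is not representable: use the ASCII fallback,
            # or "?" if the table has no entry for it.
            out.append(FALLBACK.get(ch, "?").encode(encoding))
    return b"".join(out)


print(to_client_encoding("Relation \u201ex\u201c existiert nicht", "iso8859_15"))
print(to_client_encoding("f\u00fcr", "ascii"))
```

With a LATIN9 client only the quotation marks are replaced, since the umlauts
exist in that encoding; with a plain ASCII client the umlauts become ae, oe,
ue as well.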
