Thread: BUG #11550: Error messages contain not encodable characters (Latin9)
The following bug has been logged on the website:

Bug reference: 11550
Logged by: Walter W.
Email address: willmis@gmail.com
PostgreSQL version: 9.3.5
Operating system: Linux/Windows
Description:

In 9.3 we have new characters for delimiting words. An example: "Drop table if exists mickeymouse;" delivers in PG-9.3

HINWEIS: Tabelle „mickeymouse“ existiert nicht, wird übersprungen

but delivers in PG-8.4

HINWEIS: Tabelle »mickeymouse« existiert nicht, wird übersprungen

If we set client_encoding to Latin9 (as we are here in Germany), we get an error message from PostgreSQL:

character with byte sequence 0xe2 0x80 0x9e in encoding UTF8 has no equivalent in LATIN9

This behaviour leads to problems in tools we use like Zeos etc.

Is there a way to change the delimiters in messages by some way? This would help us a lot.
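The byte sequence in the error can be checked outside the server. The following is a minimal sketch, assuming Python's `iso8859_15` codec as a stand-in for PostgreSQL's LATIN9 conversion (the server uses its own conversion code, so this only illustrates the gap in the character set). The helper name `encodable_in_latin9` is made up for this example.

```python
# Illustration only: Python's codec stands in for PostgreSQL's own
# UTF8 -> LATIN9 conversion, which is implemented separately in the server.

def encodable_in_latin9(text: str) -> bool:
    """Return True if every character of text exists in ISO 8859-15 (Latin-9)."""
    try:
        text.encode("iso8859_15")
        return True
    except UnicodeEncodeError:
        return False

# The three bytes from the error message are the UTF-8 form of U+201E,
# the low double quotation mark that opens German quotations.
low_quote = bytes([0xE2, 0x80, 0x9E]).decode("utf-8")

# 9.3-style message with typographic quotes, 8.4-style message with guillemets.
new_msg = "HINWEIS: Tabelle \u201emickeymouse\u201c existiert nicht, wird übersprungen"
old_msg = "HINWEIS: Tabelle »mickeymouse« existiert nicht, wird übersprungen"

print(hex(ord(low_quote)))
print(encodable_in_latin9(new_msg))
print(encodable_in_latin9(old_msg))
```

The guillemets used in the 8.4 message exist in Latin-9, while the typographic quotes in the 9.3 message do not, which is why only the newer message triggers the conversion error.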
On Wed, Oct 1, 2014 at 08:09:23PM +0000, willmis@gmail.com wrote:
> If we set client_encoding to Latin9 (as we are here in Germany), we get an
> error message from PostgreSQL:
>
> character with byte sequence 0xe2 0x80 0x9e in encoding UTF8 has no
> equivalent in LATIN9
>
> This behaviour leads to problems in tools we use like Zeos etc.
>
> Is there a way to change the delimiters in messages by some way?

No.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +
Re: BUG #11550: Error messages contain not encodable characters (Latin9)
From
Heikki Linnakangas
Date:
On 10/03/2014 03:15 AM, Bruce Momjian wrote:
> On Wed, Oct 1, 2014 at 08:09:23PM +0000, willmis@gmail.com wrote:
>> If we set client_encoding to Latin9 (as we are here in Germany), we get an
>> error message from PostgreSQL:
>>
>> character with byte sequence 0xe2 0x80 0x9e in encoding UTF8 has no
>> equivalent in LATIN9
>>
>> This behaviour leads to problems in tools we use like Zeos etc.
>>
>> Is there a way to change the delimiters in messages by some way?
>
> No.

Well, you could manually search & replace the .po files and run msgfmt on them. But no, there's no easy way out.

Can we fix this? It's annoying that you can't use LATIN9 with a German locale, as that is the most common encoding used with German, aside from UTF-8.

In general, it's annoying that you get errors like that if you use an encoding that can't represent all the characters in error messages. In situations like this, it would be clearly better to transliterate the quotation marks to " or «». Also with umlauts (äöü), it would be better to transliterate them to ae, oe, ue, than to throw an error.

Gettext does perform transliteration, but the problem is that we first convert the error message to the server encoding, using gettext, and then convert from server encoding to the client encoding ourselves. It would make more sense to let gettext convert directly to the client encoding. We currently construct all the messages in server encoding and convert to client encoding just before sending to the client, so changing that would require some care to keep track of which messages are already in client encoding and which ones need conversion. But if someone is willing to put some effort into it, it seems doable.

- Heikki
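The transliteration idea in this message can be sketched as a conversion that falls back to a substitute instead of raising an error. This is not PostgreSQL or gettext code. The table `FALLBACKS` and the function `to_client_encoding` are hypothetical names, and the table entries just follow the examples given above (typographic quotes to ", umlauts to ae/oe/ue).

```python
# Hypothetical fallback table, built from the examples in the message above.
FALLBACKS = {
    "\u201e": '"',   # low opening quotation mark
    "\u201c": '"',   # closing quotation mark used in German
    "ä": "ae",
    "ö": "oe",
    "ü": "ue",
}

def to_client_encoding(message: str, encoding: str) -> bytes:
    """Encode message, transliterating characters the target encoding lacks.

    Works character by character, which is only appropriate for stateless
    encodings such as ASCII or the ISO 8859 family.
    """
    out = bytearray()
    for ch in message:
        try:
            out += ch.encode(encoding)
        except UnicodeEncodeError:
            # Use the table entry if there is one, otherwise a question mark.
            substitute = FALLBACKS.get(ch, "?")
            out += substitute.encode(encoding, errors="replace")
    return bytes(out)

msg = "Tabelle \u201emickeymouse\u201c existiert nicht, wird übersprungen"
print(to_client_encoding(msg, "iso8859_15").decode("iso8859_15"))
print(to_client_encoding(msg, "ascii").decode("ascii"))
```

With Latin-9 as the target only the quotation marks are replaced, because ü exists in that encoding. With ASCII as the target the umlaut is transliterated as well.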
Re: BUG #11550: Error messages contain not encodable characters (Latin9)
From
Walter Willmertinger
Date:
I think the easiest way would be to change the error messages in a future release so that they contain only standard ANSI characters, i.e. only " and ' are used as delimiters. If someone uses an "umlaut" in a field or table name, he/she can do this, because he/she created the table or field name with his/her own client encoding.

In PG 8 versions we never had this problem, as no "bad" delimiting characters were used in error messages. They were introduced in some PG 9 version, and since then we have had a lot of problems.

For example, if you have a complicated SQL script using Latin9 client_encoding and all the output you get is "character with byte sequence 0xe2 0x80 0x9e in encoding UTF8 has no equivalent in LATIN9", you have to edit the script and do not see the real error. More problems arise if you have a Delphi application and get this error message.

- Walter

2014-10-03 9:21 GMT+02:00 Heikki Linnakangas <hlinnakangas@vmware.com>:
> On 10/03/2014 03:15 AM, Bruce Momjian wrote:
>> On Wed, Oct 1, 2014 at 08:09:23PM +0000, willmis@gmail.com wrote:
>>> If we set client_encoding to Latin9 (as we are here in Germany), we get an
>>> error message from PostgreSQL:
>>>
>>> character with byte sequence 0xe2 0x80 0x9e in encoding UTF8 has no
>>> equivalent in LATIN9
>>>
>>> This behaviour leads to problems in tools we use like Zeos etc.
>>>
>>> Is there a way to change the delimiters in messages by some way?
>>
>> No.
>
> Well, you could manually search & replace the .po files and run msgfmt on
> them. But no, there's no easy way out.
>
> Can we fix this? It's annoying that you can't use LATIN9 with a German
> locale, as that is the most common encoding used with German, aside from
> UTF-8.
>
> In general, it's annoying that you get errors like that if you use an
> encoding that can't represent all the characters in error messages. In
> situations like this, it would be clearly better to transliterate the
> quotation marks to " or «». Also with umlauts (äöü), it would be better to
> transliterate them to ae, oe, ue, than to throw an error.
>
> Gettext does perform transliteration, but the problem is that we first
> convert the error message to the server encoding, using gettext, and then
> convert from server encoding to the client encoding ourselves. It would
> make more sense to let gettext convert directly to the client encoding. We
> currently construct all the messages in server encoding and convert to
> client encoding just before sending to the client, so changing that would
> require some care to keep track which messages are already in client
> encoding and which ones need conversion. But if someone is willing to put
> some effort to it, it seems doable.
>
> - Heikki
Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> Can we fix this? It's annoying that you can't use LATIN9 with a German
> locale, as that is the most common encoding used with German, aside from
> UTF-8.

I would think that would be a matter to be taken up with the translation people. If a particular set of translated messages is using quote characters that don't exist in every encoding commonly used with that language, then it's arguable that the translator made a poor choice of quote characters. (Likewise for any other special characters of course.)

regards, tom lane
Re: BUG #11550: Error messages contain not encodable characters (Latin9)
From
Walter Willmertinger
Date:
I think this would be great, as otherwise PG 9 is very hard to use here in Western Europe. Also, changing the .po files seems to be a lot of work: we just use the Windows binaries, and there are no .po files to be found.

We hope for a change in the near future and could help (with reviews or similar) if necessary.

Regards
Walter

2014-10-03 15:42 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
> Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> > Can we fix this? It's annoying that you can't use LATIN9 with a German
> > locale, as that is the most common encoding used with German, aside from
> > UTF-8.
>
> I would think that would be a matter to be taken up with the translation
> people. If a particular set of translated messages is using quote
> characters that don't exist in every encoding commonly used with that
> language, then it's arguable that the translator made a poor choice
> of quote characters. (Likewise for any other special characters of
> course.)
>
> regards, tom lane
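For readers who do have the translation sources, the search-and-replace step suggested earlier in the thread might look roughly like the sketch below. Several things are assumptions: that the German catalog is a UTF-8 text file, that swapping the typographic quotes for the guillemets seen in the 8.4 output is the desired result, and that the sample entry resembles the real one. The names `QUOTE_MAP` and `latin9_safe_catalog` are made up for this example.

```python
# Hypothetical helper, not part of PostgreSQL or gettext.
# Maps the typographic quotes to the guillemets seen in the 8.4 message,
# which exist in Latin-9.
QUOTE_MAP = str.maketrans({"\u201e": "\u00bb", "\u201c": "\u00ab"})

def latin9_safe_catalog(po_text: str) -> str:
    """Return the .po text with typographic quotes replaced by guillemets."""
    return po_text.translate(QUOTE_MAP)

# Illustrative catalog entry (assumed, not copied from the real de.po).
sample = (
    'msgid "table \\"%s\\" does not exist, skipping"\n'
    'msgstr "Tabelle \u201e%s\u201c existiert nicht, wird übersprungen"\n'
)

fixed = latin9_safe_catalog(sample)
print(fixed)

# The edited catalog would then be recompiled with gettext's msgfmt, e.g.
#   msgfmt -o <catalog>.mo de.po
# and the resulting .mo file installed in place of the shipped one.
```

Because the replacement touches only two characters, the English msgid lines pass through unchanged.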
Re: BUG #11550: Error messages contain not encodable characters (Latin9)
From
Walter Willmertinger
Date:
It's a pity, but these problems have not been corrected so far (we just tried 9.4.4). The error messages still contain the same characters that cannot be represented in Latin9.

Translation team, could you please look into this problem? It is a very serious one here in Germany.

By the way, we found a workaround: we rename the language directory share/locale/de to share/locale/de-nix, so PostgreSQL cannot find the German language files and we get English error messages.

Regards
Walter Willmertinger
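The rename workaround described above could be scripted along these lines. This is a sketch only. The function `disable_locale` is a hypothetical name, and the installation root is an assumption that depends on how and where PostgreSQL was installed.

```python
from pathlib import Path

def disable_locale(install_root: Path, lang: str = "de", suffix: str = "-nix") -> Path:
    """Rename share/locale/<lang> so its message catalogs are no longer found.

    install_root is the PostgreSQL installation directory (an assumption here;
    the thread only names the relative path share/locale/de).
    """
    src = install_root / "share" / "locale" / lang
    dst = src.with_name(src.name + suffix)
    src.rename(dst)  # raises FileNotFoundError if the directory is missing
    return dst

# Example call with a placeholder installation root:
# disable_locale(Path("/path/to/postgresql"))
```

Renaming the directory back restores the German messages.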
Re: BUG #11550: Error messages contain not encodable characters (Latin9)
From
Peter Eisentraut
Date:
On 6/24/15 10:09 AM, Walter Willmertinger wrote:
> It's a pity, but these problems were not corrected until now (we just
> tried 9.4.4). The error messages still contain the same unprintable (in
> Latin9) characters.

This will be fixed (for German) in the next minor releases.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services