Thread: "Invalid byte sequence" message
Hi again,

I upgraded my pg installation to 9.0.3 (from 8.4.2) and now I'm having trouble looking at my log files with pgAdmin 1.12.2 (on Mac OS 10.6). On every refresh I get a message box saying:

2011-02-15 14:28:12 ERROR : ERROR: invalid byte sequence for encoding "UTF8": 0xe3b66c

The server runs on Mac OS 10.6, the data is UTF8, and the client connection settings are as well.
Any pointers?

Maximilian Tyrtania Software-Entwicklung
Dessauer Str. 6-7
10969 Berlin
http://www.contactking.de
Just found this in my log file:

<postgres%2011-02-16 13:55:32 CET22021>ERROR: invalid byte sequence for encoding "UTF8": 0xe3bc64
<postgres%2011-02-16 13:55:32 CET22021>STATEMENT: SELECT pg_file_read('pg_log/postgresql-2011-02-16_000000.log', 100000, 50000)

Still not sure what's going on there. Apparently the contents of the log file are not valid UTF8. Also, after I clicked the message boxes away, the log file's contents appear incomplete in the log viewer (a couple of hours' worth of entries are simply missing).

Maximilian Tyrtania

On 15.02.2011 at 16:07, Maximilian Tyrtania wrote:
> [...] on every refresh I'd get a messagebox saying:
>
> 2011-02-15 14:28:12 ERROR : ERROR: invalid byte sequence for encoding "UTF8": 0xe3b66c
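As an aside on why the server rejects these bytes: in UTF-8, a lead byte of the form 1110xxxx (such as 0xe3) announces a three-byte character, and the two bytes after it must each have the form 10xxxxxx. In both reported sequences the third byte (0x6c, 0x64) is a plain ASCII letter instead. The Python sketch below only illustrates that rule on the two byte sequences from the error messages. The helper names are made up for this example, and it says nothing about how the bytes ended up in the log.

```python
def is_continuation(byte: int) -> bool:
    """Return True if the byte has the UTF-8 continuation pattern 10xxxxxx."""
    return (byte & 0b11000000) == 0b10000000


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The two byte sequences quoted in the error messages of this thread.
reported = [bytes.fromhex("e3b66c"), bytes.fromhex("e3bc64")]

for seq in reported:
    lead, second, third = seq
    print(
        seq.hex(),
        "| lead byte expects 3 bytes:", (lead & 0b11110000) == 0b11100000,
        "| 2nd is continuation:", is_continuation(second),
        "| 3rd is continuation:", is_continuation(third),
        "| valid UTF-8:", is_valid_utf8(seq),
    )
```

Both sequences fail on the third byte, which matches the server's complaint about an invalid byte sequence.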
On 16/02/2011 14:21, Maximilian Tyrtania wrote:
> [...] Also, after I clicked the message boxes away, the log file's contents appear incomplete in the log viewer (a couple of hours' worth of entries are simply missing).

I suppose it stopped processing the rest of the file once it found an invalid UTF8 character. There's not much we can do about this.

--
Guillaume
http://www.postgresql.fr
http://dalibo.com
On 22/02/2011 21:58, Guillaume Lelarge wrote:
> I suppose it stopped processing the rest of the file once it found an
> invalid UTF8 character. There's not much we can do about this.

Someone on a French web forum has the same issue as you. Can you tell me the value of your lc_messages parameter?

--
Guillaume
We see this a lot with web applications where users cut and paste from MS Word. In our case, the web app and the db (Oracle) use the same character set, so no translation or validation is done. Oracle will store the values even though they aren't valid UTF8 characters. We run into problems when the values are imported into our Greenplum/Postgres dw. We don't have a workaround.

Doug

-----Original Message-----
From: pgadmin-support-owner@postgresql.org [mailto:pgadmin-support-owner@postgresql.org] On Behalf Of Guillaume Lelarge
Sent: Wednesday, February 23, 2011 2:41 PM
To: Maximilian Tyrtania
Cc: pgadmin-support@postgresql.org
Subject: Re: [pgadmin-support] "Invalid byte sequence" message

Someone on a French web forum has the same issue as you. Can you tell me the value of your lc_messages parameter?

--
Guillaume

--
Sent via pgadmin-support mailing list (pgadmin-support@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgadmin-support
On Wed, Feb 23, 2011 at 21:41, Guillaume Lelarge <guillaume@lelarge.info> wrote:
> Someone on a French web forum has the same issue as you. Can you tell
> me the value of your lc_messages parameter?

I get it quite easily with LC_MESSAGES = 'French, France' (the installer's default) on a French Windows.

See this unresolved thread for more info: http://archives.postgresql.org/pgsql-bugs/2010-09/msg00138.php
On 23/02/2011 22:51, Vik Reykja wrote:
> I get it quite easily with LC_MESSAGES = 'French, France' (the installer's
> default) on a French Windows.
>
> See this unresolved thread for more info:
> http://archives.postgresql.org/pgsql-bugs/2010-09/msg00138.php

That's what the guy has (see http://forums.postgresql.fr/viewtopic.php?pid=8214#p8214 if you read French). I assume it would work well with lc_messages set to C.

Any production server should have lc_messages set to C.

--
Guillaume
On Wed, Feb 23, 2011 at 23:11, Guillaume Lelarge <guillaume@lelarge.info> wrote:
> That's what the guy has (see
> http://forums.postgresql.fr/viewtopic.php?pid=8214#p8214 if you read
> french). I assume it would work well with lc_messages set to C.

I set it to 'en_US' which has the same effect.

> Any production server should have lc_messages set to C.

I disagree. Any system that offers to write my messages in French should do so correctly, *especially* if it's the default.

In any case, it's a core PostgreSQL bug, not a pgAdmin bug. It could be dealt with a little more gracefully, though.
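To make "a little more gracefully" concrete, here is a minimal Python sketch of lenient decoding: invalid bytes are replaced with U+FFFD so that later log entries still show up instead of being dropped. It assumes the viewer already has the raw bytes in hand, which is not the case for the pg_file_read call in this thread (that function returns text, and the error above is raised on the server). The function name and the sample bytes are invented for illustration, and this is not how pgAdmin is implemented.

```python
REPLACEMENT = "\ufffd"  # U+FFFD, the Unicode replacement character


def decode_log_chunk(raw: bytes) -> str:
    """Decode log bytes as UTF-8, substituting U+FFFD for invalid input."""
    return raw.decode("utf-8", errors="replace")


# Made-up sample: one entry containing the invalid bytes 0xe3 0xbc,
# followed by a second, perfectly valid entry.
sample = b"ERROR: ung\xe3\xbcltige Eingabe\nLOG: next entry\n"

text = decode_log_chunk(sample)
for line in text.splitlines():
    marker = " (contains undecodable bytes)" if REPLACEMENT in line else ""
    print(line + marker)
```

The trade-off is that the damaged characters are lost for good, but the entries after them remain visible.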
On 23/02/2011 23:20, Vik Reykja wrote:
> I set it to 'en_US' which has the same effect.

Yeah, that's right. I set it to C because it needs less typing :)

>> Any production server should have lc_messages set to C.
>
> I disagree. Any system that offers to write my messages in French should do
> so correctly, *especially* if it's the default.

Yeah. There are three main issues with translated messages:

* Try searching for anything on Google with French messages. You'll be lucky if you find something, and you'll get billions of results in English.
* Try asking something on the mailing lists with French messages. The first answer will be: get us the English messages.
* Try using any log parser (like pgFouine) with French messages. It won't work (even with this tool, written by a French guy).

I said French, but I suppose these issues are also a problem for other languages. "Other" meaning all but English.

> In any case, it's a core PostgreSQL bug, not a PGAdmin bug. It could be
> dealt with a little more gracefully, though.

Well, pgAdmin could read the lc_messages value to guess whether it can read the log. That's probably all we can do.

--
Guillaume
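The "read lc_messages and guess" idea could look roughly like the Python sketch below. It is a heuristic under stated assumptions, not pgAdmin code: the function name is hypothetical, and the rule (treat C, POSIX and English locales as untranslated) is inferred only from what this thread reports, namely that C and 'en_US' avoid the error while 'French, France' and "de_DE.UTF8" trigger it.

```python
def messages_likely_untranslated(lc_messages: str) -> bool:
    """Guess whether server messages are untranslated for this lc_messages value.

    Hypothetical heuristic: C, POSIX and English locales are assumed to
    produce plain English messages; anything else is treated as translated.
    """
    # Drop an encoding suffix such as ".UTF8" and normalise the case.
    name = lc_messages.strip().split(".", 1)[0].lower()
    if name in ("c", "posix", "en"):
        return True
    return name.startswith("en_") or name.startswith("english")


# Values mentioned in this thread.
for value in ("C", "en_US", "de_DE.UTF8", "French, France"):
    print(repr(value), "->", messages_likely_untranslated(value))
```

A client could use such a check to warn the user before opening the log viewer, rather than failing on the first invalid byte.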
On Wed, Feb 23, 2011 at 23:37, Guillaume Lelarge <guillaume@lelarge.info> wrote:
> Yeah. There are three main issues with translated messages:
>
> * Try searching anything on Google with french messages. You'll be
> lucky if you find something, and you'll get billions of results in
> english.

Chicken and egg. The more we discourage localized messages, the less information there will be about them. Why even bother translating?

> * Try asking something in the mailing lists with french messages. The
> first answer will be: get us the english messages.

Even on the French mailing list? Again, I see this as a problem to be solved, not avoided.

> * Try using any log parser (like pgfouine) with french messages. It
> won't work (even with this tool, written by a french guy).

So you've succeeded in making an argument for improving that tool :-)
On 23/02/2011 23:57, Vik Reykja wrote:
> Chicken and egg. The more we discourage localized messages, the less
> information there will be about them. Why even bother translating?

For new users who want to try it without having to deal with English messages.

> Even on the French mailing list? Again, I see this as a problem to be
> solved, not avoided.

No. But you don't have many hackers on the French mailing lists :)

> So you've succeeded in making an argument for improving that tool :-)

Actually, no. The English messages don't change between minor releases. French messages do, and a lot. There won't be any easy way to add such a feature to pgFouine.

--
Guillaume
On 23.02.2011 at 21:41, Guillaume Lelarge wrote:
> Someone on a French web forum has the same issue as you. Can you tell
> me the value of your lc_messages parameter?

It was set to "de_DE.UTF8". I changed it to C, which is fine with me. That seems to have fixed the problem.

Thanks,

Maximilian Tyrtania