Thread: Idle Error invalid byte sequence

Idle Error invalid byte sequence

From
Randall Wilson
Date:
Hi all,
We've got a PostgreSQL 8.1.11 database that started displaying these
error messages almost immediately AFTER it was restored. It restored
fully and successfully. They occur every few seconds to every few
minutes. Here are the error messages.

[2008-09-18 10:18:14 CDT] idleERROR:  invalid byte sequence for encoding
"UTF8": 0x8b
[2008-09-18 10:18:14 CDT] idleHINT:  This error can also happen if the
byte sequence does not match the encoding expected by the server, which
is controlled by "client_encoding".

Is there any way to figure out where the 0x8b character is?

Attachment

Re: Idle Error invalid byte sequence

From
Tom Lane
Date:
Randall Wilson <rwilson@earthcomber.com> writes:
> We've got a PostgreSQL 8.1.11 database that started displaying these
> error messages almost immediately AFTER it was restored.
> ...
> Is there any way to figure out where the 0x8b character is?

Hmm, did you perhaps change the database encoding from its previous
value?  The most likely bet is that the bad encoding is arriving from a
client, which is evidently expecting to send something other than UTF8
but not bothering to set client_encoding to tell what.

            regards, tom lane

Re: Idle Error invalid byte sequence

From
Randall Wilson
Date:
Yes, the encoding was changed from SQL_ASCII to UTF-8. Does the fact
that it's an "Idle" error conform your theory that some client is
causing the error? I thought the "Idle" error was caused by errors found
during background processes, like vacuums.

I made a plain text backup of the live database and I'm not having any
luck grepping for the character. I may be doing it wrong. If I can't
find it in the database dump, what does that tell us? Is it possibly in
a template database.

Thanks for your help,
Randy

Tom Lane wrote:
> Randall Wilson <rwilson@earthcomber.com> writes:
>
>> We've got a PostgreSQL 8.1.11 database that started displaying these
>> error messages almost immediately AFTER it was restored.
>> ...
>> Is there any way to figure out where the 0x8b character is?
>>
>
> Hmm, did you perhaps change the database encoding from its previous
> value?  The most likely bet is that the bad encoding is arriving from a
> client, which is evidently expecting to send something other than UTF8
> but not bothering to set client_encoding to tell what.
>
>             regards, tom lane
>
>

Attachment

Re: Idle Error invalid byte sequence

From
Tom Lane
Date:
Randall Wilson <rwilson@earthcomber.com> writes:
> Yes, the encoding was changed from SQL_ASCII to UTF-8. Does the fact
> that it's an "Idle" error conform your theory that some client is
> causing the error? I thought the "Idle" error was caused by errors found
> during background processes, like vacuums.

You'd have to tell us what your log_line_prefix is before we could
interpret that, but yeah, an encoding error in an incoming query string
would be reported before the backend could even consider going non-idle.

A SQL_ASCII database would have accepted any data whatsoever.  If you
want to change to something where you actually know the encoding, that's
fine, but realize that you're going to have to make your clients be
honest about what encoding they're using, too.

            regards, tom lane

Re: Idle Error invalid byte sequence

From
Randall Wilson
Date:
log_line_prefix = '[%t] %i'
So, "idle" is just the "command tag", which probably means it's not
associated with a command.

Unless you think grepping my plain text backup is a waste of time,
you've answered all my questions (Thanks) and I don't need a reply.
Thanks again,
Randy Wilson

Tom Lane wrote:
> Randall Wilson <rwilson@earthcomber.com> writes:
>
>> Yes, the encoding was changed from SQL_ASCII to UTF-8. Does the fact
>> that it's an "Idle" error conform your theory that some client is
>> causing the error? I thought the "Idle" error was caused by errors found
>> during background processes, like vacuums.
>>
>
> You'd have to tell us what your log_line_prefix is before we could
> interpret that, but yeah, an encoding error in an incoming query string
> would be reported before the backend could even consider going non-idle.
>
> A SQL_ASCII database would have accepted any data whatsoever.  If you
> want to change to something where you actually know the encoding, that's
> fine, but realize that you're going to have to make your clients be
> honest about what encoding they're using, too.
>
>             regards, tom lane
>
>

Attachment

Re: Idle Error invalid byte sequence

From
Tom Lane
Date:
Randall Wilson <rwilson@earthcomber.com> writes:
> log_line_prefix = '[%t] %i'
> So, "idle" is just the "command tag", which probably means it's not
> associated with a command.

No, you missed my point: if an incoming command contains invalidly
encoded characters, the error about that will be thrown *before* the
command tag can be changed from "idle".  I think you are looking in the
wrong place entirely --- the bad data is not in the database, it's
coming from a client.

            regards, tom lane