Re: Postgresql 9.4.4 - ERROR: invalid byte sequence for encoding "UTF8": 0x92 - Mailing list pgsql-admin

From Tom Lane
Subject Re: Postgresql 9.4.4 - ERROR: invalid byte sequence for encoding "UTF8": 0x92
Msg-id 20880.1439306294@sss.pgh.pa.us
In response to Postgresql 9.4.4 - ERROR: invalid byte sequence for encoding "UTF8": 0x92  (Prasanth Reddy <dbadmin@nqadmin.com>)
List pgsql-admin
Prasanth Reddy <dbadmin@nqadmin.com> writes:
> I am currently running 9.1.9 and trying to upgrade to 9.4. I have done a dump and restore, when I start my java
> application I am getting the below error. The server uses SQL_ASCII encoding and the
> client encoding is UTF8. There are some invalid characters in the database but this has not caused a problem in the
> current version or 9.3 (tried a restore in 9.3 and the application works fine).

>  ERROR:  invalid byte sequence for encoding "UTF8": 0x92
>  STATEMENT:  SELECT * FROM client_data WHERE status_code = 0 ORDER BY client_name, description

You need to fix the encoding errors in your data.  9.4 is intentionally
less lax about that than prior versions.
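
As a rough illustration of what such a fix might look like: the byte 0x92 can only be a continuation byte in UTF-8, so on its own it is never valid, whereas in Windows-1252 it is the right single quotation mark (U+2019). The sketch below assumes the offending bytes are Windows-1252, which the original report does not confirm; the function name repair_to_utf8 is hypothetical, and how the raw bytes are read from and written back to the database is left out.

```python
def repair_to_utf8(raw: bytes, fallback: str = "cp1252") -> bytes:
    """Return raw unchanged if it is already valid UTF-8; otherwise
    reinterpret it in the assumed legacy encoding and re-encode as UTF-8.

    Assumption: a value is either entirely valid UTF-8 or entirely in the
    fallback encoding. A value mixing both would be mangled by this approach.
    """
    try:
        raw.decode("utf-8")  # validation only; result is discarded
        return raw
    except UnicodeDecodeError:
        # Raises UnicodeDecodeError again if a byte is undefined in the
        # fallback encoding (cp1252 leaves a few byte values unassigned).
        return raw.decode(fallback).encode("utf-8")


if __name__ == "__main__":
    # 0x92 becomes the three-byte UTF-8 encoding of U+2019.
    print(repair_to_utf8(b"client\x92s"))
```

Whether Windows-1252 is the right guess depends on where the data came from, so it is worth inspecting a sample of the affected rows before converting anything in bulk.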

Or, if you really want the database to be totally encoding-ignorant,
use SQL_ASCII as both client and server encoding.  But if you have the
client declared to use UTF8, the server will try not to send anything
that isn't valid UTF8.

I believe the specific change that's biting you is

    Author: Tom Lane <tgl@sss.pgh.pa.us>
    Branch: master Release: REL9_4_BR [49c817eab] 2014-02-23 15:22:50 -0500

    Plug some more holes in encoding conversion.

    Various places assume that pg_do_encoding_conversion() and
    pg_server_to_any() will ensure encoding validity of their results;
    but they failed to do so in the case that the source encoding is SQL_ASCII
    while the destination is not.  We cannot perform any actual "conversion"
    in that scenario, but we should still validate the string according to the
    destination encoding.  Per bug #9210 from Digoal Zhou.

but there were some others of the same ilk in 9.4.
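
The behavior that commit message describes can be modelled roughly as follows. This is a simplified Python sketch, not the server's C code: the function server_to_client and the PY_CODECS table are hypothetical, only the two cases discussed in this message are covered, and Python's UTF-8 validator is used as a stand-in for the server's own checks, which may differ in detail.

```python
# Hypothetical mapping from PostgreSQL encoding names to Python codec names.
PY_CODECS = {"UTF8": "utf-8"}


def server_to_client(data: bytes, server_enc: str, client_enc: str) -> bytes:
    """Model of sending a stored value to a client, as of 9.4."""
    if server_enc == client_enc:
        # Same encoding on both sides (including SQL_ASCII on both):
        # the bytes are passed through untouched.
        return data
    if server_enc != "SQL_ASCII" or client_enc not in PY_CODECS:
        raise NotImplementedError("case not covered by this sketch")
    # Source is SQL_ASCII, destination is not: no conversion is possible,
    # but the bytes are still validated against the destination encoding.
    try:
        data.decode(PY_CODECS[client_enc])
    except UnicodeDecodeError as exc:
        bad = data[exc.start]
        raise ValueError(
            f'invalid byte sequence for encoding "{client_enc}": 0x{bad:02x}'
        ) from exc
    return data


if __name__ == "__main__":
    print(server_to_client(b"it\x92s", "SQL_ASCII", "SQL_ASCII"))
    try:
        server_to_client(b"it\x92s", "SQL_ASCII", "UTF8")
    except ValueError as err:
        print(err)
```

In this model, the pre-9.4 behavior corresponds to skipping the validation step in the SQL_ASCII-to-UTF8 case, which is why the same data did not raise an error on 9.1 or 9.3.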

            regards, tom lane

