Re: invalid byte sequence for encoding "UTF8": 0xf1612220 - Mailing list pgsql-general

From	Craig Ringer
Subject	Re: invalid byte sequence for encoding "UTF8": 0xf1612220
Date	May 11, 2011 23:52:39
Msg-id	4DCB4B68.1070102@postnewspapers.com.au
In response to	invalid byte sequence for encoding "UTF8": 0xf1612220 (AI Rumman <rummandba@gmail.com>)
Responses	Re: invalid byte sequence for encoding "UTF8": 0xf1612220
List	pgsql-general

On 05/11/2011 03:16 PM, AI Rumman wrote:
> I am trying to migrate a database from Postgresql 8.2 to Postgresql 8.3
> and getting the following error:
>
> pg_restore: [archiver (db)] Error from TOC entry 2764; 0 29708702 TABLE
> DATA originaldata postgres
> pg_restore: [archiver (db)] COPY failed: ERROR:  invalid byte sequence
> for encoding "UTF8": 0xf1612220
> HINT:  This error can also happen if the byte sequence does not match
> the encoding expected by the server, which is controlled by
> "client_encoding".
> CONTEXT:  COPY wi_originaldata, line 3592
>
> I took a dump from 8.2 server and then tried to restore at 8.3.
>
> Both the client_encoding and server_encoding are UTF8 at both the servers.

Newer versions of Pg got better at catching bad Unicode. While this helps
prevent bad data getting into the database, it's a right pain if you're
moving data over from an older version with less strict checks.

I don't know of any way to relax the checks for the purpose of importing
dumps. You'll need to fix your dump files before loading them (by
finding the faulty text and fixing it) or fix it in the origin database
before migrating the data. Neither approach is nice or easy, but nobody
has yet stepped up to write a unicode verifier tool that checks old
databases' text fields against stricter rules...
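As a rough illustration of the "find the faulty text" step, here is a minimal Python sketch that scans a plain-text dump for lines that are not valid UTF-8. It is not an existing PostgreSQL tool. The function name find_invalid_utf8 and the script name in the usage comment are made up for illustration. It assumes the dump is a plain SQL script (a custom-format archive can be converted by running pg_restore with -f and no target database), and it assumes Python's strict UTF-8 decoder is a reasonable stand-in for the 8.3 server's checks, which may not match it exactly. Incidentally, the bytes in the reported error (0xf1 0x61 0x22 0x20) would read as "ña" followed by a double quote and a space if interpreted as Latin-1, so the bad rows may simply be Latin-1 text that the older server accepted, though that is only a guess.

```python
import sys


def find_invalid_utf8(lines):
    """Yield (line_number, byte_offset, raw_line) for each line that fails strict UTF-8 decoding.

    `lines` is any iterable of bytes objects, e.g. a file opened in binary mode.
    """
    for lineno, raw in enumerate(lines, start=1):
        try:
            raw.decode("utf-8")  # strict error handling is the default
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first offending byte within the line
            yield lineno, exc.start, raw


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python find_bad_utf8.py dump.sql
    with open(sys.argv[1], "rb") as fh:
        for lineno, offset, raw in find_invalid_utf8(fh):
            context = raw[max(0, offset - 20):offset + 20]
            print(f"line {lineno}, byte {offset}: {context!r}")
```

The line numbers it prints are relative to the dump file, not to the start of a COPY block, so they will not match the "line 3592" in the server's CONTEXT message directly. Once the offending lines are known, they can be corrected by hand or re-encoded from whatever encoding they turn out to be in.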

--
Craig Ringer
