Thread: Invalid byte sequence for encoding "UTF8"

Invalid byte sequence for encoding "UTF8"

From: "Neil Saunders"
Date:

Hi All,

I'm attempting to migrate a database from a Windows installation
(8.1.4) to a Linux installation (8.2.4). I've successfully migrated
the schema, but am now running into difficulties importing the data.

For manageability, I've separated the schema from the reference data,
which is dumped separately with the following command:

pg_dump --encoding=UTF8 --username=postgres --format=c -f c:\tvp_data\currencies.data --data-only --table=currencies tvclive

Due to short-sightedness, the Windows database encoding is SQL_ASCII.
The production database is UTF8. Restoring into it produced an error,
so I tried passing --encoding=UTF8 on the pg_dump command line, but to
no avail:

pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 2279; 0 253991651 TABLE DATA currencies postgres
pg_restore: [archiver (db)] COPY failed: ERROR:  invalid byte sequence for encoding "UTF8": 0xe76169
HINT:  This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
CONTEXT:  COPY currencies, line 159
WARNING: errors ignored on restore: 1
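
The bytes named in the error can be checked by hand. As a minimal sketch (Python is used purely for illustration; nothing in this thread depends on it), the snippet below tests whether 0xe76169 is valid UTF-8 and shows what it would mean under Windows-1252. That source encoding is an assumption, not something the error message states.

```python
# Bytes reported by pg_restore in the error above: 0xe76169
raw = bytes.fromhex("e76169")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# In UTF-8, 0xE7 opens a three-byte sequence and must be followed by two
# continuation bytes in the range 0x80-0xBF. Here it is followed by
# 0x61 ('a') and 0x69 ('i'), which are plain ASCII, so decoding fails.
valid = is_valid_utf8(raw)

# ASSUMPTION: the text was originally entered as Windows-1252.
guess = raw.decode("cp1252")

# The same text, re-encoded as valid UTF-8.
fixed = guess.encode("utf-8")

print(valid, ascii(guess), fixed)
```

If the guess about Windows-1252 holds, the offending value on line 159 contains a c-cedilla followed by "ai".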

I'd appreciate any assistance anyone could offer me with this; it's
driving me nuts!

Cheers,

Neil.

Re: Invalid byte sequence for encoding "UTF8"

From: John DeSoi
Date:

You'll probably need to dump the database as SQL_ASCII (in text
format) and then use a program like iconv
(http://en.wikipedia.org/wiki/Iconv) to fix it so that it is valid UTF8.
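
For readers without iconv to hand, the conversion step can be sketched in a few lines of Python. This is a minimal sketch and not John's method: the function name reencode_dump, the file names, the sample bytes, and above all the source encoding (Windows-1252 here) are assumptions. A SQL_ASCII database does not record which encoding its bytes were written in.

```python
import tempfile
from pathlib import Path


def reencode_dump(src: Path, dst: Path, source_encoding: str = "cp1252") -> None:
    """Rewrite a text-format dump as UTF-8.

    source_encoding is an ASSUMPTION about how the data was entered.
    """
    raw = Path(src).read_bytes()
    # Strict decoding raises UnicodeDecodeError on any byte that is
    # undefined in source_encoding, rather than silently guessing.
    text = raw.decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))


# Demonstration on a throwaway file with invented content.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "currencies.sql"        # hypothetical file name
    dst = Path(tmp) / "currencies.utf8.sql"   # hypothetical file name
    src.write_bytes(b"Fran\xe7ais\n")
    reencode_dump(src, dst)
    result = dst.read_bytes()

print(result)
```

The sketch reads the whole file into memory, which is fine for a small reference table; a large dump would be better processed in chunks.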


On May 14, 2007, at 5:23 PM, Neil Saunders wrote:

> I'm attempting to migrate a database from a Windows installation
> (8.1.4) to a Linux installation (8.2.4). I've successfully migrated
> the schema, but am now running into difficulties importing the data.



John DeSoi, Ph.D.
http://pgedit.com/
Power Tools for PostgreSQL


Re: Invalid byte sequence for encoding "UTF8"

From: "Neil Saunders"
Date:

Excellent - Thanks John - I'll give that a crack.

Any idea why the --encoding flag didn't do this via pg_dump?

Thanks again,

Neil.

On 5/15/07, John DeSoi <desoi@pgedit.com> wrote:
> You'll probably need to dump the database as SQL_ASCII (in text
> format) and then use a program like iconv
> (http://en.wikipedia.org/wiki/Iconv) to fix it so that it is valid UTF8.
>
>
> On May 14, 2007, at 5:23 PM, Neil Saunders wrote:
>
> > I'm attempting to migrate a database from a Windows installation
> > (8.1.4) to a Linux installation (8.2.4). I've successfully migrated
> > the schema, but am now running into difficulties importing the data.

Re: Invalid byte sequence for encoding "UTF8"

From: John DeSoi
Date:

Hi Neil,

On May 15, 2007, at 5:51 PM, Neil Saunders wrote:

> Any idea why the --encoding flag didn't do this via pg_dump?


I'm not really sure. I suppose, for one thing, it might be difficult
to do the right thing going from SQL_ASCII to UTF8.
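
As background rather than a definitive answer: PostgreSQL does no encoding conversion when the database encoding is SQL_ASCII, because bytes above 127 are stored without any record of what they mean. The sketch below illustrates why a converter cannot simply guess. The list of candidate encodings is an arbitrary illustration, not a claim about this particular database.

```python
# The first byte from the error message, read under several single-byte
# encodings. Each one maps it to some character, so nothing in the data
# itself says which reading is the intended one.
byte = b"\xe7"

# ASSUMPTION: an arbitrary handful of encodings a Windows client might use.
candidates = ["cp1252", "latin-1", "cp437", "koi8-r"]

readings = {enc: byte.decode(enc) for enc in candidates}

for enc, ch in readings.items():
    # Print code points rather than glyphs so the output is plain ASCII.
    print(f"{enc:8} -> U+{ord(ch):04X}")

distinct = len(set(readings.values()))
print("distinct readings:", distinct)
```

Only someone who knows how the data was entered can pick the right source encoding, which is why the conversion has to be done explicitly, for example with iconv.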

I had some issues with 8.0 -> 8.1 that I was able to fix with
something like this before dumping it:

update my_table set comments = translate(comments,  chr(169), 'c')
where comments like '%' || chr(169) || '%';
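
A targeted cleanup like the translate() call above depends on knowing which values contain problem bytes. As a rough sketch, assuming a text-format dump is available, the following Python lists every line that is not valid UTF-8. The function name and the sample rows are invented for illustration, and line numbers in a dump file will not necessarily match the "COPY ... line N" numbers that pg_restore reports.

```python
def find_invalid_utf8_lines(data: bytes):
    """Yield (line_number, raw_line) for each line that is not valid UTF-8."""
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            yield number, line


# Invented sample rows; the second one contains a lone 0xE7 byte.
sample = b"USD\tUS Dollar\nFRF\tFranc fran\xe7ais\nEUR\tEuro\n"

bad = list(find_invalid_utf8_lines(sample))

for number, line in bad:
    print(number, line)
```

Each reported line can then be fixed in the source database before dumping, or the whole file can be re-encoded once the original encoding is known.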


