Re: Invalid byte sequence when importing Oracle BLOB - Mailing list pgsql-general

From Laurenz Albe
Subject Re: Invalid byte sequence when importing Oracle BLOB
Msg-id 3762f2cd801291ce2092138dcc8db01f64d60082.camel@cybertec.at
In response to Invalid byte sequence when importing Oracle BLOB  (Ron <ronljohnsonjr@gmail.com>)
List pgsql-general
On Mon, 2021-04-26 at 06:49 -0500, Ron wrote:
> The source is an Oracle 12 db with this encoding:
> NLS_LANG = AMERICAN_AMERICA.AL32UTF8
> NLS_NCHAR = AL32UTF8
> 
> The destination is an (RDS) Postgresql 12.5 with encoding UTF8, and is being 
> loaded through COPY commands generated by ora2pg.
> 
> The source table has a BLOB column (I think they are scanned images) which 
> I'm loading into a Postgresql bytea column.
> 
> Seven times out of about 60M rows, I get this error:
> Psql:909242: ERROR:  invalid byte sequence for encoding "UTF8": 0xed 0xaf 0xbf
> 
> Is there anything I can change on the Postgresql side to make these records 
> import correctly?

The "bytea" column is probably a red herring: this error message is most
likely caused by a "text", "varchar" or other string data column.

It is surprisingly easy to enter corrupt strings into Oracle: just set the
client encoding to the same value as the server encoding, and Oracle won't
check the integrity of your strings.

If that is your problem, you must identify and fix the data in Oracle.
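
For what it is worth, the bytes 0xed 0xaf 0xbf from the error message are
the three-byte encoding of the surrogate code point U+DBFF, which strict
UTF-8 does not allow. Here is a minimal Python sketch of a check you could
run over string data once you have it as raw bytes. The function name and
the sample value are made up for illustration, and how you get the raw
bytes out of Oracle is left open:

```python
def first_invalid_utf8_offset(data: bytes):
    """Return the byte offset where strict UTF-8 decoding fails, or None."""
    try:
        data.decode("utf-8")  # strict mode rejects encoded surrogates
    except UnicodeDecodeError as exc:
        return exc.start
    return None


# Hypothetical sample value containing the bytes from the error message.
sample = b"scan \xed\xaf\xbf 0042"
print(first_invalid_utf8_offset(sample))

# Decoding leniently shows which code point the three bytes stand for.
code_point = ord(b"\xed\xaf\xbf".decode("utf-8", "surrogatepass"))
print(f"U+{code_point:04X}")
```

Any value for which the function returns an offset would be rejected by
PostgreSQL in the same way, so this could help narrow down the seven rows.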

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com



