Thread: DB Encoding question

DB Encoding question

From: Sbob
All;


We are converting from DB2 on the mainframe to PostgreSQL 14 in the 
cloud. We have been using IIDR to dump DB2 tables to CSV files and then 
using pg_loader / COPY to import the files. However, the DB2 encoding is 
EBCDIC and the PostgreSQL db encoding is UTF8. This is causing some rows 
/ columns to fail when we try to load the data into PostgreSQL.


Will we be better off changing the PostgreSQL encoding to match the DB2 
database? Will this cause other issues down the road? Is there a 'best 
practice' for this use case?


Thanks in advance





Re: DB Encoding question

From: Bruce Momjian
On Mon, Aug 22, 2022 at 10:20:41AM -0600, Sbob wrote:
> All;
> 
> 
> We are converting from DB2 on the mainframe to PostgreSQL 14 in the cloud.
> We have been using IIDR to dump DB2 tables to CSV files and then using
> pg_loader / COPY to import the files. However the DB2 Encoding is EBCDIC and
> the PostgreSQL db encoding is UTF8, this is causing some rows / columns to
> fail when we try to load the data into PostgreSQL
> 
> Will we be better off changing the PostgreSQL encoding to match the DB2
> database? Will this cause other issues down the road? Is there a 'best
> practice' for this use case?

I would convert the dump file to UTF8 using iconv and then load it. 
I would otherwise suggest setting the _client_ encoding to EBCDIC and
keeping the server encoding as UTF8, but we don't support EBCDIC as far
as I can tell.
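
For readers without iconv at hand, the same conversion step can be
sketched in Python. This is only an illustration, not part of the
original reply: it assumes the dump uses IBM code page 037 (Python's
"cp037" codec), which the thread does not state, and the function name
convert_to_utf8 is made up for this example.

```python
import shutil

# Assumption: the dump is in IBM code page 037. "EBCDIC" is a family of
# code pages, so substitute the one your DB2 system actually uses.
SOURCE_CODEPAGE = "cp037"


def convert_to_utf8(src_path, dst_path, codepage=SOURCE_CODEPAGE):
    """Re-encode a text file from an EBCDIC code page to UTF-8.

    newline="" turns off newline translation on both sides, so line
    endings are written exactly as they were decoded.
    """
    with open(src_path, "r", encoding=codepage, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        # Copies in chunks, so large dump files are not read into memory.
        shutil.copyfileobj(src, dst)
```

A call such as convert_to_utf8("dump.ebcdic.csv", "dump.utf8.csv")
(hypothetical file names) would produce a file that can then be loaded
with COPY into the UTF8 database.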

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson




Re: DB Encoding question

From: Tom Lane
Sbob <sbob@quadratum-braccas.com> writes:
> We are converting from DB2 on the mainframe to PostgreSQL 14 in the 
> cloud. We have been using IIDR to dump DB2 tables to CSV files and then 
> using pg_loader / COPY to import the files. However the DB2 Encoding is 
> EBCDIC and the PostgreSQL db encoding is UTF8, this is causing some rows 
> / columns to fail when we try to load the data into PostgreSQL

You're going to need to run the data through an encoding conversion,
then.
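
Before converting, it helps to know which EBCDIC code page the data is
in, since the variants disagree on a handful of characters such as
square brackets. The sketch below is an editorial illustration, not
from the original reply: it decodes a sample value whose correct text
is already known and reports which candidate code pages reproduce it.
The candidate list and the function name matching_codepages are
assumptions made for this example.

```python
# Assumption: a few EBCDIC codecs that ship with Python, chosen as
# examples only. Extend the list with whatever your site might use.
CANDIDATE_CODEPAGES = ["cp037", "cp500", "cp1140", "cp273"]


def matching_codepages(raw, expected, candidates=CANDIDATE_CODEPAGES):
    """Return the candidate code pages under which raw decodes to expected."""
    matches = []
    for name in candidates:
        try:
            if raw.decode(name) == expected:
                matches.append(name)
        except UnicodeDecodeError:
            # This code page has no mapping for some byte in the sample.
            continue
    return matches


# Demo with bytes produced locally; with real data, raw would come from
# the dump file and expected from what the mainframe screen shows.
raw_sample = "a[1]".encode("cp037")
print(matching_codepages(raw_sample, "a[1]"))
```

A sample containing only letters and digits tends to match several
code pages at once, so values with punctuation or accented characters
make better probes.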

> Will we be better off changing the PostgreSQL encoding to match the DB2 
> database?

No, because Postgres doesn't support EBCDIC encoding.

            regards, tom lane