Re: UTF-8 data migration problem in Postgresql 7.2 - Mailing list pgsql-hackers

From Tatsuo Ishii
Subject Re: UTF-8 data migration problem in Postgresql 7.2
Date
Msg-id 20020220095055B.t-ishii@sra.co.jp
In response to UTF-8 data migration problem in Postgresql 7.2  (Jean-Michel POURE <jm.poure@freesurf.fr>)
Responses Re: UTF-8 data migration problem in Postgresql 7.2  (Jean-Michel POURE <jm.poure@freesurf.fr>)
List pgsql-hackers
> I tried to migrate a UTF-8 database with Japanese content from Postgresql
> 7.1.3 to Postgresql 7.2 unsuccessfully.
>
> I made a pg_dump of the 7.1.3 database and tried to upload it in 7.2.
> Error message was "Invalid Unicode character sequence found".
>
> I tried to upload data in a fresh PostgreSQL 7.1.3 database and everything
> worked OK. The problem can be reproduced on PostgreSQL 7.2.
>
> Do not hesitate to ask me for a pg_dump of the involved tables. I cannot
> submit them on the list for privacy reasons. Thanks for your help.

I have received the test data and got:

psql:dump_error.sql:40: ERROR:  copy: line 1, Invalid UNICODE character sequence found (0xe38227)

The error message shows that your data contains an invalid UTF-8
character sequence. The first byte (0xe3) indicates a 3-byte UTF-8
sequence, in which every byte must have its 8th (high) bit set. However,
the third byte is 0x27, which does not satisfy that requirement. Check
your data.
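
To locate such sequences in a dump before loading it, something like the
following might help. It is only a minimal sketch in Python, not the check
the backend performs: the function names are made up for illustration, the
lead-byte ranges follow the current UTF-8 definition (at most 4 bytes), and
overlong or surrogate encodings are not rejected.

```python
def expected_length(lead: int) -> int:
    """Return the sequence length implied by a UTF-8 lead byte, or 0 if
    the byte cannot start a sequence. (Hypothetical helper.)"""
    if lead < 0x80:
        return 1          # plain ASCII
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3          # 0xe3 falls here
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 0              # stray continuation byte or invalid lead


def first_invalid_sequence(data: bytes):
    """Return (offset, offending bytes) for the first malformed sequence,
    or None if every sequence is structurally well formed."""
    i = 0
    while i < len(data):
        n = expected_length(data[i])
        chunk = data[i:i + n] if n else data[i:i + 1]
        # Every byte after the lead must look like 10xxxxxx (high bit set).
        bad_tail = any((b & 0xC0) != 0x80 for b in chunk[1:])
        if n == 0 or len(chunk) < n or bad_tail:
            return i, chunk
        i += n
    return None


if __name__ == "__main__":
    # The sequence reported in the error message above.
    print(first_invalid_sequence(b"\xe3\x82\x27"))
```

Reading the dump file in binary mode and passing its contents to
first_invalid_sequence() should give the byte offset of the first
offending sequence, which can then be inspected in the source data.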
--
Tatsuo Ishii
