Thank you, Devrim.
Am I correct in saying then that UTF-8 and UNICODE are the same thing as far
as PostgreSQL is concerned?
Best regards,
James Dey
-----Original Message-----
From: Devrim GUNDUZ [mailto:devrim@commandprompt.com]
Sent: 10 February 2006 10:45 AM
To: James Dey
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] Migration
Hi,
On Fri, 2006-02-10 at 10:38 +0200, James Dey wrote:
> I have installed PostgreSQL 8.1 on a new box, and when I try to
> create a new database, even when specifying --encoding=UNICODE, I
> cannot create UNICODE databases; it makes them all UTF-8.
>
>
> Then, when I try to restore a DB dump, I keep getting "invalid UTF-8
> byte sequence detected near byte xxx" errors, and although the restore
> continues, a lot of data is missing.
http://developer.postgresql.org/docs/postgres/release-8-1.html
"Some users are having problems loading UTF-8 data into 8.1.X. This is
because previous versions allowed invalid UTF-8 byte sequences to be
entered into the database, and this release properly accepts only valid
UTF-8 sequences. One way to correct a dumpfile is to run the command
iconv -c -f UTF-8 -t UTF-8 -o cleanfile.sql dumpfile.sql. The -c option
removes invalid character sequences. A diff of the two files will show
the sequences that are invalid. iconv reads the entire input file into
memory so it might be necessary to use split to break up the dump into
multiple smaller files for processing."
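
If memory is the concern, the same kind of cleanup can be done in fixed-size
chunks instead of splitting the dump. What follows is only a minimal Python
sketch of that idea, not something from the release notes: the function name
clean_utf8_stream and the 1 MB chunk size are made up for illustration, and
Python's UTF-8 decoder is assumed (not verified) to reject roughly the same
sequences as the 8.1 server and iconv -c do.

```python
import codecs


def clean_utf8_stream(src, dst, chunk_size=1 << 20):
    """Copy binary stream src to dst, dropping bytes that are not valid UTF-8.

    Hypothetical helper for illustration. Returns the number of bytes dropped.
    """
    # The incremental decoder buffers a multi-byte sequence that is split
    # across two chunks, so chunk boundaries do not cause false positives.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="ignore")
    bytes_in = 0
    bytes_out = 0
    while True:
        chunk = src.read(chunk_size)
        final = not chunk  # an empty read means end of input
        text = decoder.decode(chunk, final=final)
        data = text.encode("utf-8")
        dst.write(data)
        bytes_in += len(chunk)
        bytes_out += len(data)
        if final:
            break
    return bytes_in - bytes_out


if __name__ == "__main__":
    import sys

    # Usage (file names are placeholders): python clean.py dumpfile.sql cleanfile.sql
    with open(sys.argv[1], "rb") as src, open(sys.argv[2], "wb") as dst:
        dropped = clean_utf8_stream(src, dst)
    print(f"dropped {dropped} invalid byte(s)", file=sys.stderr)
```

As with the iconv approach, a diff of the input and output files shows what
was removed. Review that diff before restoring, since the dropped bytes
usually point to text that was stored in some other encoding.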
Regards,