Re: MSSQL to PostgreSQL : Encoding problem - Mailing list pgsql-general

From Richard Huxton
Subject Re: MSSQL to PostgreSQL : Encoding problem
Msg-id 4563A827.8080809@archonet.com
In response to Re: MSSQL to PostgreSQL : Encoding problem  (Tony Caduto <tony_caduto@amsoftwaredesign.com>)
Responses Re: MSSQL to PostgreSQL : Encoding problem  ("Magnus Hagander" <mha@sollentuna.net>)
List pgsql-general
Tony Caduto wrote:
> Arnaud Lesauvage wrote:
>>
>>
>>> I then try to import into PostgreSQL. The farthest I can get is when
>>> using the UNICODE export, and importing it using a client_encoding
>>> set to UTF8 (I also tried WIN1252, LATIN9, LATIN1, ...).
>>> The copy then stops with an error:
>>> ERROR: invalid byte sequence for encoding "UTF8": 0xff
>>> SQL state: 22021
>>>
>>> The problematic character is the euro currency symbol.
>>
>>
> Exporting from MS SQL Server as Unicode is going to give you full
> Unicode, not UTF8.  Full Unicode is 2 bytes per character and UTF8 is 1,
> same as ASCII.
> You will have to encode the Unicode data to UTF8.

Well, UTF8 is a minimum of one byte per character, but takes two to four
bytes for non-ASCII characters. The idea is that code points below 128 are
encoded exactly as in ASCII. There's also UTF16, which uses two or four
bytes per character, and UTF32, which always uses four.
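
To make the sizes concrete, here is a minimal Python sketch (Python is my
choice for illustration, nobody in the thread used it). It prints how many
bytes a few characters take in each encoding. It also shows why a file that
starts with a UTF16 byte order mark would be rejected as UTF8 at byte 0xff,
which is consistent with the error above. That the MS SQL "Unicode" export
is little-endian UTF16 with a byte order mark is an assumption on my part.
The helper names are made up for the example.

```python
import codecs

# Escapes keep this file pure ASCII: "A", e-acute, euro sign, an emoji.
SAMPLES = ["A", "\u00e9", "\u20ac", "\U0001F600"]

def encoded_lengths(ch):
    """Return the byte length of ch in UTF-8, UTF-16 and UTF-32 (no BOM)."""
    return tuple(len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le"))

for ch in SAMPLES:
    print("U+%04X" % ord(ch), encoded_lengths(ch))

# Assumption: the export is little-endian UTF-16 with a byte order mark,
# so the file starts with the bytes FF FE.
utf16_euro = codecs.BOM_UTF16_LE + "\u20ac".encode("utf-16-le")

def first_bad_byte(data):
    """Return the first byte that is invalid as UTF-8, or None if data decodes."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return data[exc.start]
    return None
```

Calling first_bad_byte(utf16_euro) reports the leading byte of the byte
order mark, while the same euro sign encoded as UTF8 decodes cleanly.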

> I have done this in Delphi using its built-in UTF8 encoding and
> decoding routines.   You can get a free copy of Delphi Turbo Explorer
> which includes components for MS SQL Server and ODBC, so it would be
> pretty straightforward to get this working.
>
> The actual method in Delphi is system.UTF8Encode(widestring).  This will
> encode Unicode to UTF8 which is compatible with a PostgreSQL UTF8 database.

Ah, that's useful to know. Windows just doesn't have the same quantity
of tools installed as a *nix platform.

> I am sure Perl could do it also.

And in one line if you're clever enough no doubt ;-)
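
Not Perl and not one line, but as a rough sketch the same re-encoding step
could look like this in Python. It assumes the export file is UTF16 with a
byte order mark, which the thread does not confirm. The function name and
the file names are hypothetical.

```python
def utf16_to_utf8(src_path, dst_path, chunk_chars=64 * 1024):
    """Re-encode a UTF-16 text file as UTF-8 (written without a BOM).

    The "utf-16" codec reads the byte order mark to pick the byte order
    and strips it. newline="" leaves line endings exactly as they were.
    """
    with open(src_path, "r", encoding="utf-16", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        while True:
            chunk = src.read(chunk_chars)  # read in chunks to bound memory use
            if not chunk:
                break
            dst.write(chunk)

# Example call with hypothetical file names:
# utf16_to_utf8("export_unicode.txt", "export_utf8.txt")
```

The resulting file should then load with client_encoding set to UTF8,
provided the data itself is valid.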

--
   Richard Huxton
   Archonet Ltd

