Thread: Converting a DB from SQL_ASCII to UTF-8

Converting a DB from SQL_ASCII to UTF-8

From: Michael Monnerie

Dear list,

how would I convert an existing database (dbmail mail server,
http://dbmail.org/) from SQL_ASCII to UTF-8?

I cannot just dump and reload, because:

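psql:/mnt/dbmail:191030: FEHLER:  ungültige Byte-Sequenz für
Kodierung »UTF8«: 0xdc62
TIP:  Dieser Fehler kann auch auftreten, wenn die Bytesequenz nicht mit
der Kodierung übereinstimmt, die der Server erwartet, welche
durch »client_encoding« bestimmt wird.

[In English:]

psql:/mnt/dbmail:191030: ERROR:  invalid byte sequence for
encoding "UTF8": 0xdc62
HINT:  This error can also happen if the byte sequence does not match
the encoding expected by the server, which is controlled
by "client_encoding".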

(How can I get PostgreSQL to give me its messages in English?)

Now what would be the correct way to convert, without destroying any of
the DB contents? There are e-mails there, so binary values are
important, too.

Kind regards, zmi
--
// Michael Monnerie, Ing.BSc    -----      http://it-management.at
// Tel: 0676/846 914 666                      .network.your.ideas.
// PGP Key:         "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: EA39 8918 EDFF 0A68 ACFB  11B7 BA2D 060F 1C6F E6B0
// Keyserver: www.keyserver.net                   Key-ID: 1C6FE6B0

Re: Converting a DB from SQL_ASCII to UTF-8

From: Ivo Rossacher

There is no general answer to this question. It depends heavily on how the
data was entered.
SQL_ASCII is not an encoding but rather the absence of any encoding: whatever
bytes the client sends are stored in the database as they are. So if data was
entered with different encodings on the client side (typically from different
clients on different operating systems), the database now holds a mix of
encodings, and what to do with the data depends on which encodings those
clients used. If you are very lucky and all entries were made in the same
encoding, you can run iconv or recode on the dump and reload the corrected
dump. Otherwise you will have to adjust the dump manually, piece by piece,
with the help of recode or iconv.
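
For the mixed case, a small script can do per line what iconv would do for
the whole file. The following is only a minimal sketch, not a tested migration
tool. It assumes a plain-text dump, and it assumes that every line which is
not already valid UTF-8 was written by an ISO-8859-1 client. The function
names and the fallback encoding are assumptions made for illustration; check
them against your own data before relying on the result.

```python
def to_utf8(raw: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Return raw unchanged if it is valid UTF-8, else transcode it.

    The fallback encoding is an assumption about the clients that wrote
    the data. ISO-8859-1 accepts every byte value, so this never fails,
    but it silently produces wrong characters if the guess is wrong.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(fallback).encode("utf-8")


def convert_dump(src_path: str, dst_path: str,
                 fallback: str = "iso-8859-1") -> int:
    """Copy a plain-text dump line by line, fixing lines that are not UTF-8.

    Returns the number of lines that had to be transcoded, so the changed
    lines can be reviewed before the dump is reloaded.
    """
    changed = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:  # binary mode: lines are split on b"\n" only
            fixed = to_utf8(line, fallback)
            if fixed != line:
                changed += 1
            dst.write(fixed)
    return changed
```

A call such as convert_dump("dbmail.sql", "dbmail.utf8.sql") (both file names
are hypothetical) writes a second file and leaves the original dump untouched,
so the two can be compared with diff before anything is reloaded.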
Once the database uses UTF-8, the server will convert to and from the
encoding of your client for you. SQL_ASCII tells the server not to convert
anything.
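
To make "not converting anything" concrete, the sketch below encodes the same
text the way an ISO-8859-1 client and a UTF-8 client would send it. The sample
word is hypothetical. Its ISO-8859-1 form happens to start with the bytes 0xdc
0x62, the sequence named in the error message above, which is consistent with
(but no proof of) ISO-8859-1 data sitting in the SQL_ASCII database.

```python
word = "Übersicht"  # hypothetical sample text, not taken from the dump

# What two different clients would send for the same word.
latin1_bytes = word.encode("iso-8859-1")  # starts with 0xdc 0x62
utf8_bytes = word.encode("utf-8")         # starts with 0xc3 0x9c 0x62


def is_valid_utf8(raw: bytes) -> bool:
    """Report whether raw can be decoded as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 0xdc announces a two-byte UTF-8 sequence, but 0x62 ("b") is not a
# continuation byte, so the ISO-8859-1 form is rejected as UTF-8.
print(latin1_bytes.hex(" "), is_valid_utf8(latin1_bytes))
print(utf8_bytes.hex(" "), is_valid_utf8(utf8_bytes))
```

A SQL_ASCII database stores either byte string as it arrives. A UTF-8
database accepts only the second form, or converts the first one if the
client has declared its encoding.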

Best regards
Ivo
On Monday, 18 June 2007 16:36, Michael Monnerie wrote:
> Dear list,
>
> how would I convert an existing database (dbmail mail server,
> http://dbmail.org/) from SQL_ASCII to UTF-8?
>
> I cannot just dump and reload, because:
>
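> psql:/mnt/dbmail:191030: FEHLER:  ungültige Byte-Sequenz für
> Kodierung »UTF8«: 0xdc62
> TIP:  Dieser Fehler kann auch auftreten, wenn die Bytesequenz nicht mit
> der Kodierung übereinstimmt, die der Server erwartet, welche
> durch »client_encoding« bestimmt wird.
>
> [In English:]
>
> psql:/mnt/dbmail:191030: ERROR:  invalid byte sequence for
> encoding "UTF8": 0xdc62
> HINT:  This error can also happen if the byte sequence does not match
> the encoding expected by the server, which is controlled
> by "client_encoding".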
>
> (How can I get PostgreSQL to give me its messages in English?)
>
> Now what would be the correct way to convert, without destroying any of
> the DB contents? There are e-mails there, so binary values are
> important, too.
>
> Kind regards, zmi