Thread: Converting SQL-ASCII encoding database to UNICODE


Rajesh Kumar Mallah
Hi,

My current database in 7.3.4 is SQL_ASCII. Are there
any benefits to converting it to UNICODE encoding
in 7.4? If so, what is the process?


Re: Converting SQL-ASCII encoding database to UNICODE

Jean-Michel POURE
On Sunday 9 November 2003 19:39, Rajesh Kumar Mallah wrote:
> If so what is the process

The advantage of using a Unicode database is that UTF-8 can represent the
characters of practically every other encoding at once. Therefore, in the
process of development, it can help you save time.

When using a Unicode database, if the client does not support Unicode, it is
always possible to recode data streams on the fly with "set client_encoding =
foo_encoding". Therefore, there is no real drawback in using Unicode. It may
only be a little slower, but there is no real proof.
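
As a minimal sketch of that setting, a script can be prefixed with the SET
command before it is sent to psql. The helper name with_client_encoding and
the table name customers are made up for illustration; LATIN1 stands in for
whatever encoding the client actually uses.

```shell
#!/bin/sh
# Hypothetical helper: prefix the SQL read from stdin with a client_encoding
# setting, so the server converts data to and from that encoding on the fly.
with_client_encoding() {
    # $1: encoding name the client understands, e.g. LATIN1
    printf "SET client_encoding = '%s';\n" "$1"
    cat
}

# Possible usage (customers is a made-up table name):
#   echo "SELECT name FROM customers;" | with_client_encoding LATIN1 | psql new_database
```

Setting the PGCLIENTENCODING environment variable before starting the client
is another way to get the same effect without touching the SQL itself.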

The process of conversion is as follows:

- dump the whole database using pg_dump:
pg_dump database_name > pg_data_dump.sql

- Do not drop the old database. Make a backup of the dump:
cp pg_data_dump.sql pg_data_dump.sql.bak

- recode the dump using the GNU recode package:
recode ..u8 pg_data_dump.sql

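With the source charset left out, recode assumes your default charset
(normally that of the current locale), which may not match the data actually
stored in the SQL_ASCII database.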

- Open the file in a UTF-8 editor like Kate and verify that all characters
are preserved and display well (in Hindi?). If they do not, the original
encoding was probably misidentified. Restore the backup, then retry with the
source encoding named explicitly (source_encoding is a placeholder):

cp -f pg_data_dump.sql.bak pg_data_dump.sql
recode source_encoding..u8 pg_data_dump.sql

- create an empty Unicode database:
psql template1
create database new_database with encoding = 'UNICODE';

- reload the dump:
psql new_database < pg_data_dump.sql
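
The recode-and-verify steps can also be sketched as a small script. This is
only an illustration under stated assumptions: it uses iconv instead of
recode, the function names is_valid_utf8 and convert_dump are made up, and
ISO-8859-1 in the usage line is a guess at the source encoding that you must
replace with your own.

```shell
#!/bin/sh
# Sketch: convert a dump to UTF-8 from an explicitly named source encoding,
# then confirm the result is well-formed UTF-8 before loading it.
# Assumption: iconv is used here in place of recode.

# is_valid_utf8 FILE: succeeds only if FILE decodes cleanly as UTF-8.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# convert_dump SRC_ENCODING INPUT OUTPUT: writes the converted copy to OUTPUT
# and leaves INPUT untouched, so the backup stays usable for another attempt.
convert_dump() {
    iconv -f "$1" -t UTF-8 "$2" > "$3" && is_valid_utf8 "$3"
}

# Possible usage:
#   convert_dump ISO-8859-1 pg_data_dump.sql.bak pg_data_dump.sql \
#       && psql new_database < pg_data_dump.sql
```

The validity check only catches malformed byte sequences. A wrongly guessed
source encoding can still produce well-formed UTF-8 with the wrong
characters, so looking at the file in an editor remains necessary.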

GNU recode is available in most GNU/Linux distributions.

By the way, as you live in India, do not hesitate to visit if you can help us.

Most translations into the languages of India are "stalled", so do not
hesitate to take over the Hindi translation, for example.
