Thread: PostgreSQL JDBC Driver versus Encoding
Hello,
I have a PostgreSQL database whose encoding is SQL_ASCII.
Using Java code, I copied data into it from a MySQL database whose encoding is UTF-8, and the data in the destination database now contains strange characters that look like encoding problems.
Is there any additional parameter we can set in the PostgreSQL connection string to deal with these problems?
Or could the problem be related to the encoding used to connect to the MySQL database?
Best regards,
João Paulo Pires
Developer
João Paulo Pires wrote:
> I have a PostgreSQL database which encoding is SQL_ASCII.

This is the cause of your problem. SQL_ASCII provides essentially no encoding information at all, so the JDBC driver does not know how to translate between the database and Java's internal UTF-16 representation. If you are going to be inserting anything other than 7-bit ASCII text into it, you are going to have problems.

If you can, try using a database encoding that can represent the data you are inserting. UTF8 / UNICODE is probably a good choice unless you have special requirements.

See http://www.postgresql.org/docs/8.2/static/multibyte.html, especially the bit that says:

> The SQL_ASCII setting behaves considerably differently from the other
> settings. When the server character set is SQL_ASCII, the server
> interprets byte values 0-127 according to the ASCII standard, while byte
> values 128-255 are taken as uninterpreted characters. No encoding
> conversion will be done when the setting is SQL_ASCII. Thus, this setting
> is not so much a declaration that a specific encoding is in use, as a
> declaration of ignorance about the encoding. In most cases, if you are
> working with any non-ASCII data, it is unwise to use the SQL_ASCII
> setting, because PostgreSQL will be unable to help you by converting or
> validating non-ASCII characters.

-O
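To illustrate the point about byte values 128-255 being left uninterpreted, here is a minimal, self-contained Java sketch of what can happen when bytes written as UTF-8 are later decoded under a different assumption. The class and method names are hypothetical, and ISO-8859-1 is only an assumed example of a wrong guess; this does not reproduce what the JDBC driver actually does, only the general mechanism behind the "strange characters".

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Encode the text as UTF-8, then decode those same bytes as ISO-8859-1.
    // Every byte above 127 becomes its own (wrong) character.
    static String utf8ReadAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Count the UTF-8 bytes that fall outside the 7-bit ASCII range (0-127).
    static int countNonAsciiBytes(String text) {
        int count = 0;
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            if ((b & 0xFF) > 127) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // "Jo\u00e3o" is "João"; the escape keeps this source file pure ASCII.
        String name = "Jo\u00e3o";
        // The single character U+00E3 is two bytes in UTF-8 (0xC3 0xA3).
        System.out.println(countNonAsciiBytes(name));
        // Decoded as ISO-8859-1, those two bytes turn into U+00C3 and U+00A3.
        System.out.println(utf8ReadAsLatin1(name));
    }
}
```

Text made only of 7-bit ASCII characters passes through this round trip unchanged, which is why such problems tend to show up only in accented names and similar data.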
João Paulo Pires wrote:
> Hello,
>
> I have a PostgreSQL database whose encoding is SQL_ASCII.
>
> Using Java code, I copied data into it from a MySQL database whose
> encoding is UTF-8, and the data in the destination database now contains
> strange characters that look like encoding problems.
>
> Is there any additional parameter we can set in the PostgreSQL
> connection string to deal with these problems?
>
> Or could the problem be related to the encoding used to connect to the
> MySQL database?

You can't represent every UTF-8 character in plain ASCII. Characters that the two encodings share carry over unchanged, but ASCII defines only 128 characters. You'll need to convert the database to UTF-8, then redo your inserts.
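Before redoing the inserts, it may help to know which values are affected at all. The following is a minimal sketch, with hypothetical class and method names, that checks whether a string fits in ASCII using the standard `java.nio.charset` API. It is not part of either JDBC driver; it only assumes the source data has already been read into Java strings.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

class AsciiCheck {
    // True if every character of the text can be encoded as US-ASCII.
    static boolean isPureAscii(String text) {
        // Encoders are not thread-safe, so create a fresh one per call.
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder();
        return encoder.canEncode(text);
    }

    // Index of the first character outside the 0-127 range, or -1 if none.
    static int firstNonAsciiIndex(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) > 127) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // "Jo\u00e3o" is "João"; the escape keeps this source file pure ASCII.
        String[] samples = { "Developer", "Jo\u00e3o" };
        for (String s : samples) {
            System.out.println(isPureAscii(s) + " " + firstNonAsciiIndex(s));
        }
    }
}
```

Values for which `isPureAscii` returns false are the ones that need a database encoding such as UTF8 to be stored and read back reliably.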