Re: JDBC to load UTF8@psql to latin1@mysql - Mailing list pgsql-general

From Emi Lu
Subject Re: JDBC to load UTF8@psql to latin1@mysql
Date
Msg-id 50CB71DF.5060004@encs.concordia.ca
In response to Re: JDBC to load UTF8@psql to latin1@mysql  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: JDBC to load UTF8@psql to latin1@mysql  (Emi Lu <emilu@encs.concordia.ca>)
List pgsql-general
Hello All,
>> Meh.  That character renders as \310 in your mail, which is not an
>> assigned code in ISO 8859-1.  The numerically corresponding Unicode
>> value would be U+0090, which is an unspecified control character.
>
> Oh, scratch that, apparently I can't do hex/octal arithmetic in my
> head first thing in the morning.  It's really U+00C8 which is perfectly
> valid.  I can't see a reason why that character and only that character
> would be problematic --- have you done systematic testing to confirm
> that that's the only should-be-LATIN1 character that fails?
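
For anyone following the octal/hex arithmetic above, here is a minimal Java sketch that checks it. The class and method names are made up for illustration and are not from the thread. It converts the octal escape 310 to a code point and shows how that character is encoded in ISO 8859-1 and in UTF-8.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class OctalCheck {
    // Parse the digits of an octal escape such as "310" into a number.
    public static int octalToCodePoint(String octalDigits) {
        return Integer.parseInt(octalDigits, 8);
    }

    // Encode one code point as ISO 8859-1 (LATIN1) bytes.
    public static byte[] latin1Bytes(int codePoint) {
        return new String(Character.toChars(codePoint))
                .getBytes(StandardCharsets.ISO_8859_1);
    }

    // Encode one code point as UTF-8 bytes.
    public static byte[] utf8Bytes(int codePoint) {
        return new String(Character.toChars(codePoint))
                .getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        int cp = octalToCodePoint("310");
        System.out.printf("U+%04X%n", cp);                       // U+00C8
        System.out.println(latin1Bytes(cp).length + " byte(s) in ISO 8859-1");
        System.out.println(utf8Bytes(cp).length + " byte(s) in UTF-8");
    }
}
```

Octal 310 is 3*64 + 1*8 + 0 = 200 = 0xC8, so the character is U+00C8: one byte (0xC8) in ISO 8859-1 and two bytes (0xC3 0x88) in UTF-8.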

Finally, the problem is resolved:

SHOW VARIABLES LIKE "character\_set\_%";
+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| character_set_client     | latin1 |
| character_set_connection | latin1 |
| character_set_database   | latin1 |
| character_set_filesystem | binary |
| character_set_results    | latin1 |
| character_set_server     | latin1 |
| character_set_system     | utf8   |
+--------------------------+--------+

Note that MySQL uses utf8 for character_set_system here.

I changed my Java code to:
========================
public static String utf8_to_mysql(String str)
    throws Exception
    {
       try
       {
          // Encode the string to UTF-8 bytes, then decode those bytes as UTF-8 again.
          byte[] convertStringToByte = str.getBytes("UTF-8");
          str                        = new String(convertStringToByte, "UTF-8");
          return str;
       }catch(Exception e)
       {
          // log is the logger declared elsewhere in the class (not shown).
          log.error("utf8_to_mysql Error: " + e.getMessage());
          log.error(e);
          throw e;
       }
    }

I have to specify "UTF-8" explicitly; the charset argument cannot be left empty.

Following Larry's comments (from the MyBatis mailing list, quoted below), I
tried "UTF-8" for both the "from" and the "to" encoding. It works. This is
still a little bit strange to me, but it works!
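
As a rough illustration of why using the same charset on both sides behaves well, here is a small Java sketch. The class and method names are hypothetical, and this is not a diagnosis of the original problem. Encoding and decoding with the same charset returns the original string, while decoding UTF-8 bytes as ISO 8859-1 splits È into two characters.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
final class RoundTrip {
    // Encode as UTF-8 and decode as UTF-8: "from" and "to" agree.
    public static String sameCharset(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Encode as UTF-8 but decode as ISO 8859-1: "from" and "to" disagree.
    public static String mismatched(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String eGrave = "\u00C8"; // the character discussed in this thread
        System.out.println(sameCharset(eGrave).equals(eGrave)); // true
        System.out.println(mismatched(eGrave).length());        // 2
    }
}
```

Calling getBytes() or new String(bytes) without a charset argument falls back to the JVM's default charset, which depends on the platform and JVM settings, so naming the charset explicitly removes that variable.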

 >> My guess is that it's correct but the client you're using is messing
 >> it up. If not, then you need to look at your connection strings to
 >> the 2 databases to make sure they are handling the encodings
 >> correctly. Unless you set them specifically, I suspect they are using
 >> your default system encoding - so both may be using utf8 or iso8859.
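
To make the connection-string suggestion concrete, here is a hedged Java sketch that only builds a MySQL JDBC URL with an explicit encoding. The host, database name, and helper are placeholders. It assumes the characterEncoding property of MySQL Connector/J, so please check the documentation for your driver version before relying on it.

```java
// Hypothetical helper, for illustration only; it does not open a connection.
final class JdbcUrlSketch {
    // Build a MySQL JDBC URL that names the connection encoding explicitly.
    // Assumption: the driver is MySQL Connector/J and honours characterEncoding.
    public static String mysqlUrl(String host, String database, String encoding) {
        return "jdbc:mysql://" + host + "/" + database
                + "?characterEncoding=" + encoding;
    }

    public static void main(String[] args) {
        // "dbhost" and "mydb" are placeholder names.
        System.out.println(mysqlUrl("dbhost", "mydb", "UTF-8"));
    }
}
```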

Thank you very much for all of your help with this!
Emi


