Re: Character Decoding Problems - Mailing list pgsql-jdbc
From | Evan Tsue |
---|---|
Subject | Re: Character Decoding Problems |
Date | |
Msg-id | 40249E56-CD41-11D7-A787-000A95A08104@windsormgmt.com |
In response to | Re: Character Decoding Problems ("zy7111" <zy7111@mail.china.com>) |
Responses |
Re: Character Decoding Problems
|
List | pgsql-jdbc |
Ok, I've sat down with the problem a little bit more. It now seems to me that the decodeUTF8 method is doing the decoding correctly. It places the result of translating from UTF-8 to UTF-16 in the char[] l_cdata variable. It then creates a new String by calling new String(l_cdata, 0, j). I believe that the variable j is the length of the filled-in portion of the l_cdata array. l_cdata is a class variable that is reused between method calls (the decodeUTF8 method is synchronized). This seems to be the problem. I haven't figured out why yet.

I also have the same problem when running on FreeBSD (using the FreeBSD 1.4 JVM).

Evan

On Tuesday, Aug 12, 2003, at 21:28 US/Eastern, zy7111 wrote:

> I use pg73jdbc3.jar as JDBC driver. It works fine.
>
>> Yes, it should work in 7.2.2. The decodeUTF8 method wasn't introduced
>> until later. From the comments in the code, it seems that the reason
>> for its inclusion was performance.
>>
>> Evan
>>
>> On Tuesday, Aug 12, 2003, at 08:34 US/Eastern, <zy7111@mail.china.com>
>> wrote:
>>
>>> I can insert and retrieve Chinese into postgresql 7.2.2 successfully.
>>> Both operations go through JDBC.
>>> It seems you insert text using psql and retrieve using JDBC.
>>>
>>> ----- Original Message -----
>>> From: "Evan Tsue" <evan@windsormgmt.com>
>>> To: <pgsql-jdbc@postgresql.org>
>>> Sent: Tuesday, August 12, 2003 1:38 PM
>>> Subject: [JDBC] Character Decoding Problems
>>>
>>>> Hi,
>>>>
>>>> I've been having problems decoding non-Latin characters using the
>>>> Postgres JDBC driver. Here's the situation: I'm using postgres 7.3.2
>>>> and I've created a test database using 'createdb -E UNICODE testdb'
>>>> to ensure that I really am using the UNICODE character set. Using
>>>> psql, I created a table using the following command: 'CREATE TABLE
>>>> messages (message_uid SERIAL PRIMARY KEY, message_text VARCHAR(255))'
>>>> to test character encoding and decoding. At that point, I inserted a
>>>> message that was in English. I also inserted a message that was in
>>>> Arabic. I did a select on that table using psql and the values came
>>>> back perfectly (I'm using MacOS X, so the characters are displayed
>>>> correctly).
>>>>
>>>> Next, I did a select on the same table via JDBC. All I had the
>>>> program do was select on the table and print the results out to
>>>> standard output. The message in English was displayed perfectly.
>>>> However, the message that was in Arabic was displayed as a series of
>>>> question marks and spaces.
>>>>
>>>> I eventually navigated my way through the JDBC driver source to find
>>>> that the problem is in the decodeUTF8 method in the
>>>> org.postgresql.core.Encoding class. Apparently, it doesn't seem to be
>>>> working properly for non-Western characters. I replaced the call to
>>>> that method with a call to the java.lang.String constructor and now
>>>> everything works perfectly.
>>>>
>>>> In addition to Arabic, I took a random sample of Chinese, Japanese,
>>>> Russian and Korean text and inserted it into the database. Using the
>>>> original driver, I get the question marks. But, when I used the
>>>> String constructor, everything comes out fine.
>>>>
>>>> Could someone please either fix the Encoding.decodeUTF8 method or
>>>> replace the call to that with a call to the String constructor?
>>>>
>>>> Thanks,
>>>> Evan
>>>>
>>>> ---------------------------(end of broadcast)---------------------------
>>>> TIP 8: explain analyze is your friend
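
For readers following the thread, the two decoding paths discussed above can be compared side by side. What follows is only a minimal sketch, not the driver's actual code: the class name Utf8DecodeSketch, the methods decodeIntoSharedBuffer and decodeWithStringConstructor, and the buffer field are all hypothetical stand-ins for the decodeUTF8 method and the l_cdata variable described in the message. The hand-rolled path assumes well-formed input and handles only 1- to 3-byte UTF-8 sequences.

```java
import java.nio.charset.StandardCharsets;

final class Utf8DecodeSketch {
    // Shared, reused buffer (hypothetical; plays the role of l_cdata in the message).
    private static char[] buffer = new char[64];

    // Hypothetical hand-rolled decoder: UTF-8 bytes -> UTF-16 chars in the shared buffer.
    // Assumes well-formed input; 4-byte sequences are rejected rather than decoded.
    public static synchronized String decodeIntoSharedBuffer(byte[] data, int offset, int length) {
        if (buffer.length < length) {
            buffer = new char[length]; // one char per input byte is always enough
        }
        int i = offset;
        int end = offset + length;
        int j = 0; // number of chars filled in so far
        while (i < end) {
            int b = data[i++] & 0xFF; // mask so bytes >= 0x80 are not sign-extended
            if (b < 0x80) {
                buffer[j++] = (char) b;
            } else if (b >= 0xC0 && b < 0xE0) {
                int b2 = data[i++] & 0x3F;
                buffer[j++] = (char) (((b & 0x1F) << 6) | b2);
            } else if (b >= 0xE0 && b < 0xF0) {
                int b2 = data[i++] & 0x3F;
                int b3 = data[i++] & 0x3F;
                buffer[j++] = (char) (((b & 0x0F) << 12) | (b2 << 6) | b3);
            } else {
                throw new IllegalArgumentException("unsupported lead byte 0x" + Integer.toHexString(b));
            }
        }
        // Copies only the filled-in portion, so later reuse of the buffer cannot alter the result.
        return new String(buffer, 0, j);
    }

    // The workaround from the message: let java.lang.String do the decoding.
    public static String decodeWithStringConstructor(byte[] data, int offset, int length) {
        return new String(data, offset, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // English, Arabic (2-byte sequences) and Chinese (3-byte sequences).
        String original = "hello \u0645\u0631\u062d\u0628\u0627 \u4f60\u597d";
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        String a = decodeIntoSharedBuffer(utf8, 0, utf8.length);
        String b = decodeWithStringConstructor(utf8, 0, utf8.length);
        // Compare with equals() so the check does not depend on the console's encoding.
        System.out.println(a.equals(original) && b.equals(original));
    }
}
```

Comparing the decoded strings with equals() rather than printing them keeps the check independent of how the terminal renders non-Latin text.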