Re: JDBC encoding problem - Mailing list pgsql-jdbc

From Anders Hermansen
Subject Re: JDBC encoding problem
Date
Msg-id 20030805180020.GA22193@online.no
In response to JDBC encoding problem  (Kurt Overberg <kurt@hotdogrecords.com>)
List pgsql-jdbc
* Kurt Overberg (kurt@hotdogrecords.com) wrote:
> I'm having a rather strange problem that I'm hoping someone can help me
> with.  I'm using Struts 1.0/JSP on Debian Linux under Tomcat 4.1.x and
> the Blackdown JVM.  I'm attempting to convert my current SQL_ASCII
> database to UNICODE.  I'm new to this, so am most likely making a few
> mistakes.  Here's what I've done so far:
>
> o  Converted database encoding to be UNICODE.  I'm pretty sure this part
> worked okay.  (did a pg_dump, then iconv -f 8859_1 -t UTF-8, then
> created new db with encoding UNICODE and reloaded- no errors upon reload)
>
> sparky:~$ psql -l
>         List of databases
>    Name    |  Owner   | Encoding
> -----------+----------+-----------
>  unitest   | kurt     | UNICODE
>  template1 | postgres | SQL_ASCII
> (2 rows)

Ok.

> o  set client_encoding to 'UTF8';

As I read in another thread, client_encoding does not matter for the
JDBC driver: it sets the encoding to UNICODE itself when it connects,
probably because all Java strings are Unicode.

It will probably still matter for your psql connections, if you use any.

> o  In my JSP files, I set the following at the top of each:
>
> <%@ page language="java" pageEncoding="UTF-8" %>

Try changing this to:

<%@ page language="java" pageEncoding="UTF-8" contentType="text/html;charset=UTF-8" %>

> Now, to test this, I go to a Japanese page, copy some text, then paste
> it into a form that gets submitted to the server and saved into the DB.
> Then I try to display what I got back from the database.  It comes out
> garbled.  HOWEVER- if I leave the 'pageEncoding' out of my display .jsp
> file it still comes out garbled, UNTIL I set UTF-8 manually in my
> browser's Character Encoding settings (both Mozilla and IE).  Then the
> Japanese characters render fine (just like I entered them).
>
> Very strange.  What's confusing is that when I set the pageEncoding to
> 'UTF-8', the characters don't render properly, and as far as I can tell,
> that's the same as setting the encoding manually in the browser.  I must
> be doing something wrong because I get the same results in IE and Mozilla
> (recent build).
>
> What may be the problem- I don't do anything differently when getting
> the data out of the database, just standard
> resultset.getString("column");  Do I need to change that call, to handle
> the potentially UTF-8 encoded strings?  I can't find anything on that at
> all with google/usenet.

Have you tried inserting Unicode characters into the database directly
with psql, and then displaying them through the web tier? That would
tell you whether the problem is on the way in or on the way out.

I have used Tomcat as the web application server for many applications,
and it defaults to the ISO-8859-1 character set when decoding POSTed
form data, strange as that is.
You can change this by calling
request.setCharacterEncoding("UTF-8");
before you read any parameters from the request.

Maybe the pageEncoding="UTF-8" changes this? I have not used that option
before.
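
To illustrate what the ISO-8859-1 default does to a UTF-8 form
submission, here is a minimal standalone sketch. It does not use the
servlet API at all; the class name FormDecodingDemo and its decode
helper are made up for the example, and it assumes a JVM recent enough
to have java.nio.charset.StandardCharsets. It only shows the effect of
decoding the same bytes with the wrong and the right character set.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class FormDecodingDemo {
    // Hypothetical helper: turn raw request bytes into a String using
    // the given charset, as a container does when it parses parameters.
    static String decode(byte[] rawBytes, Charset charset) {
        return new String(rawBytes, charset);
    }

    public static void main(String[] args) {
        String typed = "\u65e5\u672c\u8a9e";              // three Japanese characters
        byte[] sent = typed.getBytes(StandardCharsets.UTF_8); // what the browser sends

        // Decoded with the ISO-8859-1 default: every byte becomes one char.
        String wrong = decode(sent, StandardCharsets.ISO_8859_1);
        // Decoded as UTF-8: the original text comes back.
        String right = decode(sent, StandardCharsets.UTF_8);

        System.out.println("wrong length: " + wrong.length());
        System.out.println("right equals typed: " + right.equals(typed));
    }
}
```

If the garbled version is what ends up in the database, the damage has
already happened before JDBC is involved.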

Please check again that the data you put into the database through JDBC
has not already been garbled by a character set conversion.
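
One way to check this, sketched here under my own assumptions, is to
print the Unicode code points of the string you get back from
resultset.getString("column") instead of trusting how it renders. The
class CodePointDump and its describe method are hypothetical names for
this example, and the code assumes Java 8 or later for
String.codePoints(). Japanese text should show code points such as
U+65E5; text that was decoded as ISO-8859-1 shows runs in the
U+0080 to U+00FF range instead.

```java
public class CodePointDump {
    // Hypothetical helper: render each code point of s as U+XXXX,
    // separated by single spaces.
    static String describe(String s) {
        StringBuilder out = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", cp));
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // Correctly stored Japanese text.
        System.out.println(describe("\u65e5\u672c"));
        // The first character's UTF-8 bytes misread as ISO-8859-1.
        System.out.println(describe("\u00e6\u0097\u00a5"));
    }
}
```

In the application you would pass the value read from the result set to
describe and write the result to the log.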

> Any and all help, suggestions or pointers would be greatly appreciated.


I hope this helps,
Anders

--
Anders Hermansen
YoYo Mobile as
