Re: Encoding nightmare! Pls help! - Mailing list pgsql-jdbc

From: John Sidney-Woollett
Subject: Re: Encoding nightmare! Pls help!
Msg-id: 2625.192.168.0.64.1075755002.squirrel@mercury.wardbrook.com
In response to: Re: Encoding nightmare! Pls help! (Barry Lind <blind@xythos.com>)
List: pgsql-jdbc

Thanks for your reply - but I'm still totally baffled and confused...

I've got JSP pages printing tést.jpg as tést.jpg even though I have set
the content type to "text/html", 'UTF-8'.

Also, I've got servlets writing garbage into the database even though I
have explicitly set the encoding on the request object to UTF-8 - but this
may be because I don't really know what encoding scheme the browser has used.
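
A minimal sketch of the kind of mismatch that can produce this sort of garbage: text encoded as UTF-8 but decoded as ISO-8859-1. The class and method names (MojibakeDemo, misdecode, repair) are hypothetical, and StandardCharsets assumes a newer Java than the one discussed in this thread. It only illustrates the mechanism; it does not show where the mismatch occurs in the JSP or servlet code above.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encode the text as UTF-8, then (wrongly) decode those bytes as ISO-8859-1.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Undo the mistake: recover the raw bytes, then decode them as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // \u00e9 is e-acute; the escape avoids depending on the source file encoding.
        String original = "t\u00e9st.jpg";
        String garbled = misdecode(original);
        // The single accented character becomes two characters (U+00C3, U+00A9).
        System.out.println(original.length() + " -> " + garbled.length());
        System.out.println(repair(garbled).equals(original));
    }
}
```

The repair step only works because ISO-8859-1 maps every byte to exactly one character, so no information is lost in the wrong decode.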

It's been a long day...

Thanks for your help.

John

Barry Lind said:
> John,
>
> My first guess is that your code is working fine; it is just your
> System.out.print() calls that are the problem.  You haven't specified
> what character set to use when printing, so it will use the default
> character set for your JVM.  This is necessary since Java strings are
> stored internally in UCS-2 and Java needs to convert to some other
> character set when printing them out.
>
> thanks,
> --Barry
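
Barry's point about the default character set could be sketched as follows: wrap the output stream in a PrintStream with an explicit encoding instead of relying on the JVM default. The class and helper names (Utf8Print, utf8Stream, printedBytes) are hypothetical. Whether the result displays correctly still depends on the terminal expecting UTF-8, which is an assumption here.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8Print {
    // Wrap a stream in a PrintStream that always encodes characters as UTF-8,
    // regardless of the JVM's default character set.
    static PrintStream utf8Stream(OutputStream out) {
        try {
            return new PrintStream(out, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Helper for inspection: return the exact bytes that printing would emit.
    static byte[] printedBytes(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream ps = utf8Stream(buffer);
        ps.print(text);
        ps.flush();
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        PrintStream out = utf8Stream(System.out);
        out.println("t\u00e9st.jpg"); // \u00e9 is e-acute
    }
}
```

With plain System.out.print(), the same string would be encoded with whatever the JVM default happens to be, which is why the output can look wrong even when the String itself is correct.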
>
> John Sidney-Woollett wrote:
>> I've had a discussion on the general list about the implications for
>> storing accented characters within a postgres (7.4.1) db.
>>
>> As a result, I have created a database with no locale, i.e. the C locale
>> (using initdb {other parms} --no-locale)
>>
>> Here's the database (test) with UNICODE encoding
>>
>>          List of databases
>>      Name     |  Owner   | Encoding
>> --------------+----------+----------
>>  test         | postgres | UNICODE
>>  template0    | postgres | UNICODE
>>  template1    | postgres | UNICODE
>>
>> The problem I'm having is that I CANNOT write accented characters into
>> the database or get them out correctly using my java code and the
>> pg74.1jdbc3.jar file.
>>
>> Using psql, I select the data (with client encoding = UNICODE), and I get
>>
>> tést.jpg
>>
>> With client encoding = LATIN1, I get
>>
>> tést.jpg
>>
>> But in my little java test app, I get:
>>
>> tést.jpg, tést.jpg, tést.jpg,
>>
>> I want tést.jpg!!!!!
>>
>> Here is the offending section of code:
>>
>> // 1. Let the driver decode the column into a Java String.
>> String filename = rset.getString(2);
>> System.out.print(filename);
>> System.out.print(", ");
>>
>> if (filename != null)
>> {
>>   // 2. Decode the raw column bytes as UTF-8.
>>   try
>>   {
>>     filename = new String(rset.getBytes(2), "UTF-8");
>>   }
>>   catch (UnsupportedEncodingException e)
>>   {
>>     System.out.println("Cannot decode string?");
>>   }
>>
>>   System.out.print(filename);
>>   System.out.print(", ");
>>
>>   // 3. Decode the raw column bytes as ISO-8859-1.
>>   try
>>   {
>>     filename = new String(rset.getBytes(2), "ISO-8859-1");
>>   }
>>   catch (UnsupportedEncodingException e)
>>   {
>>     System.out.println("Cannot decode string?");
>>   }
>>
>>   System.out.print(filename);
>>   System.out.print(", ");
>> }
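
One way to check what a String really contains, independent of how the console renders it, is to print each character's numeric value in plain ASCII. This is a minimal sketch; the class and method names (CodeUnitDump, dump) are hypothetical, and it prints UTF-16 code units rather than full code points, which is sufficient for the characters in this thread.

```java
public class CodeUnitDump {
    // Render each UTF-16 code unit of the string as U+XXXX, separated by spaces.
    // The output is pure ASCII, so the console encoding cannot distort it.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the value read from the result set; \u00e9 is e-acute.
        String filename = "t\u00e9st.jpg";
        System.out.println(dump(filename));
    }
}
```

If the value from rset.getString(2) dumps with a single U+00E9 for the accented character, the String is correct and only the printing is at fault. If it shows U+00C3 U+00A9 instead, the bytes were decoded with the wrong character set somewhere before the print call.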
>>
>> Can anyone explain what I am doing wrong? I have become so confused by
>> all this, that I don't think I can see the problem straight anymore.
>>
>> How can I read and write Unicode chars into the db? Is there some magic
>> parameter that needs to be passed when setting up the connection/driver?
>>
>> Thanks for any/all help!!
>>
>> John Sidney-Woollett

