Re: JDBC driver, PGSQL 7.3.2 and accents characters - Mailing list pgsql-jdbc

From Csaba Nagy
Subject Re: JDBC driver, PGSQL 7.3.2 and accents characters
Date
Msg-id 1048152062.1058.25.camel@coppola.ecircle.de
In response to Re: JDBC driver, PGSQL 7.3.2 and accents characters  (Daniel Bruce Lynes <dlynes@shaw.ca>)
Responses Re: JDBC driver, PGSQL 7.3.2 and accents characters
List pgsql-jdbc
Your procedure makes absolutely no sense, because Strings in Java are
always stored as Unicode. What you propose is basically this:
- you have a Unicode string in the first place;
- you encode that string into the "text" byte array using "ISO-8859-1";
- you decode the "ISO-8859-1"-encoded byte array back into a Unicode
String, interpreting the bytes as "UTF-8"... which will more than likely
corrupt every accented character, because the bytes are NOT "UTF-8".
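
To make the steps above concrete, here is a minimal sketch of what that conversion does to the string from the original report. The class and method names are made up for illustration, and it uses java.nio.charset.StandardCharsets, which is an assumption about the Java version (it needs Java 7 or later; on older JVMs you would pass the charset names as strings, as in the quoted code).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the JDBC driver or of the quoted code.
final class EncodingRoundTrip {

    // Encode as ISO-8859-1, then decode the very same bytes as UTF-8.
    static String misdecode(String original) {
        byte[] latin1 = original.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Forl\u00ec Sud"; // 'Forlì Sud'
        String broken = misdecode(original);

        // The lone ISO-8859-1 byte for 'ì' is not a valid UTF-8 sequence,
        // so the decoder substitutes the replacement character U+FFFD.
        System.out.println("same string:   " + broken.equals(original));
        System.out.println("has U+FFFD:    " + (broken.indexOf('\uFFFD') >= 0));
    }
}
```

Pure ASCII text survives this conversion unchanged, which is why the mistake often goes unnoticed until the first accented character shows up.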

HTH
Csaba.


On Thu, 2003-03-20 at 00:11, Daniel Bruce Lynes wrote:
> On Wednesday 19 March 2003 01:35, Davide Romanini wrote:
>
> > I'm having problems with the jdbc driver. I've tried the jdbc2,
> > jdbc, latest stable and also the development release.
> > I have a database in postgres with some varchar fields. The database
> > uses SQL_ASCII as its character encoding. In those varchar fields I've
> > also stored names with accents such as è, à, ì etc... They work fine
> > using the psql program, and also linking tables to access through the
> > odbc driver. But when I try to use jdbc to connect to the database my
> > accents fail to load.
> > For example I have the string 'Forlì Sud'. When I try to
> > System.out.println this string fetched by jdbc with rs.getString, I see
> > this string instead of the original one: 'Forl?ud'.
> > I've also tried different character sets in the connection url,
> > like ISO-8859-1, UNICODE, WIN, SQL_ASCII, but nothing changed.
> >
> > Please help me, because this bug makes java and jdbc pretty unusable
> > for connecting to pgsql databases.
>
> I doubt very much it's a bug in pgsql.  It's more than likely a
> misunderstanding on your part about how character sets work in Java.
>
> I'm guessing Barry Lind didn't read the last part of your message, or he
> probably would've known what the problem was, as well.
>
> He is correct however, in stating that PostgreSQL probably will not allow you
> to save accented characters in a database with an encoding of SQL_ASCII.
> You'll need to use SQL_UNICODE(?) as the encoding, more than likely.
>
> Because your character set is iso-8859-1 however, you'll need to convert the
> strings to Unicode first, before saving to the database.
>
> You do this as follows:
>
>     byte[] text=myString.getBytes("iso-8859-1") ;
>     String myNewString=new String(text,"utf-8") ;
>     stmt.setString(x,myNewString) ;
>
> To get it back out, try the following:
>
>     String myString=rs.getString(x) ;
>     byte[] text=myString.getBytes("utf-8") ;
>     String myNewString=new String(text,"iso-8859-1") ;
>
> If you want your code to be portable, I must insist that you specify the
> character set every time you get bytes or create strings.  The reason
> is that different operating environments will have different default
> character sets.  For instance, in our office, I've got three default
> character sets.  On one Linux machine, it's ISO-8859-1, on another, it's
> GB2312-80, and on the Windows machines it's CP859(?).  The codepage in
> question on Windows is Microsoftese for ISO-8859-1/Latin 1/US ASCII with
> Latin A, depending on which standard you're used to.  It's also often
> referred to as CP437 (DOS and OS/2).
>
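
The one point in the quoted message that does hold is naming the character set explicitly whenever you convert between Strings and bytes. A small sketch of that, again with made-up class and method names and assuming Java 7 or later for StandardCharsets:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class illustrating explicit charsets only.
final class ExplicitCharsets {

    // Encode and decode with the SAME charset: the only safe round trip.
    static String roundTrip(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        String s = "Forl\u00ec Sud"; // 'Forlì Sud', 9 characters

        // The byte count depends on the charset, not on the platform default.
        System.out.println(s.getBytes(StandardCharsets.ISO_8859_1).length); // 9
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length);      // 10

        System.out.println(roundTrip(s, StandardCharsets.UTF_8).equals(s)); // true
    }
}
```

The charset only matters at the boundary between a String and a byte array. A String that is already in memory has no encoding left to "convert".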


