Thread: JDBC driver, PGSQL 7.3.2 and accents characters

JDBC driver, PGSQL 7.3.2 and accents characters

From

Davide Romanini

Date:

19 March 2003, 04:33:44

Hi,

I'm having problems with the JDBC driver. I've tried the jdbc2 and jdbc
drivers, in both the latest stable and the development release.
I have a PostgreSQL database with some varchar fields. The database
encoding is SQL_ASCII. In those varchar fields I've also stored names
with accented characters such as è, à, ì, etc. They work fine in the psql
program, and also when linking tables to Access through the ODBC driver.
But when I connect to the database through JDBC, the accented characters
fail to load. For example, I have the string 'Forlì Sud'. When I
System.out.println this string, fetched through JDBC with rs.getString,
I see 'Forl?ud' instead of the original.
I've also tried different character sets in the connection URL, such as
ISO-8859-1, UNICODE, WIN and SQL_ASCII, but nothing changed.

Please help me, because this bug makes Java and JDBC pretty much unusable
for connecting to PostgreSQL databases.

Bye, Romaz

Re: JDBC driver, PGSQL 7.3.2 and accents characters

From

Barry Lind

Date:

19 March 2003, 12:46:07

Davide,

Those characters are not part of the SQL_ASCII character set.  SQL_ASCII
is 7bit ascii, the characters you are trying to use are all 8bit
characters.  You need to create your database with a character set that
supports the characters you are trying to store.  LATIN1 or UNICODE
would be good choices.

thanks,
--Barry

Davide Romanini wrote:
> Hi,
>
> I've nice problems with the jdbc driver. I've tried with the jdbc2,
> jdbc, latest stable and also development release.
> I've a database in postgres with some varchar fields. The database is
> SQL_ASCII as char encoding. In that varchar fields I've stored also
> names with accents such è, à, ì etc... They work fine using the psql
> program, and also linking tables to access through the odbc driver. But
> when I try to use jdbc to connect to database my accents fail to load.
> For example I have the string 'Forlì Sud'. When I try to
> system.out.println this string catched by jdbc with rs.getString, I see
> this string instead of the original one: 'Forl?ud'.
> I've tried also to use different character sets in the connection url
> like ISO-8859-1, UNICODE, WIN, SQL_ASCII but didn't change anything.
>
> Please help me, because this bug makes java and jdbc pretty unusable to
> connect pgsql databases.
>
> Bye, Romaz

Re: JDBC driver, PGSQL 7.3.2 and accents characters

From

Barry Lind

Date:

19 March 2003, 15:18:57

Mario,

I do not know of any easy way.  I believe the character set for a
database can only be set when the database is created.  So you would
need to dump the data, drop and recreate the database with the correct
character set and then reload the data.

thanks,
--Barry


Mario Rodriguez Villanea wrote:
> Hi, I've got the same problem, but my database is already in use, and
> I'm wondering if there is a way to change the encoding with an SQL command
> like ALTER DATABASE ENCODING 'LATIN1' or something like that.
>
> thanks

Re: JDBC driver, PGSQL 7.3.2 and accents characters

From

Daniel Bruce Lynes

Date:

19 March 2003, 17:59:07

On Wednesday 19 March 2003 01:35, Davide Romanini wrote:

> I've nice problems with the jdbc driver. I've tried with the jdbc2,
> jdbc, latest stable and also development release.
> I've a database in postgres with some varchar fields. The database is
> SQL_ASCII as char encoding. In that varchar fields I've stored also
> names with accents such è, à, ì etc... They work fine using the psql
> program, and also linking tables to access through the odbc driver. But
> when I try to use jdbc to connect to database my accents fail to load.
> For example I have the string 'Forlì Sud'. When I try to
> system.out.println this string catched by jdbc with rs.getString, I see
> this string instead of the original one: 'Forl?ud'.
> I've tried also to use different character sets in the connection url
> like ISO-8859-1, UNICODE, WIN, SQL_ASCII but didn't change anything.
>
> Please help me, because this bug makes java and jdbc pretty unusable to
> connect pgsql databases.

I doubt very much it's a bug in pgsql.  It's probably more than likely a
misunderstanding on your part about how character sets work in Java.

I'm guessing Barry Lind didn't read the last part of your message, or he
probably would've known what the problem was, as well.

He is correct however, in stating that PostgreSQL probably will not allow you
to save accented characters in a database with an encoding of SQL_ASCII.
You'll need to use SQL_UNICODE(?) as the encoding, more than likely.

Because your character set is iso-8859-1 however, you'll need to convert the
strings to Unicode first, before saving to the database.

You do this as follows:

    byte[] text=myString.getBytes("iso-8859-1") ;
    String myNewString=new String(text,"utf-8") ;
    stmt.setString(x,myNewString) ;

To get it back out, try the following:

    String myString=rs.getString(x) ;
    byte[] text=myString.getBytes("utf-8") ;
    String myNewString=new String(text,"iso-8859-1") ;

If you want your code to be portable, I should insist on you specifying the
character set every time for getting bytes and creating strings.  The reason
being is that different operating environments will have different default
character sets.  For instance, in our office, I've got three default
character sets.  On one Linux machine, it's ISO-8859-1, on another, it's
GB2312-80, and on the Windows machines it's CP859(?).  The codepage in
question on Windows is Microsoftese for ISO-8859-1/Latin 1/US ASCII with
Latin A, depending on which standard you're used to.  It's also often
referred to as CP437 (DOS and OS/2).

Re: JDBC driver, PGSQL 7.3.2 and accents characters

From

Carlos Correia

Date:

19 March 2003, 20:25:22

Davide Romanini wrote:
> Hi,
>
> I've nice problems with the jdbc driver. I've tried with the jdbc2,
> jdbc, latest stable and also development release.
> I've a database in postgres with some varchar fields. The database is
> SQL_ASCII as char encoding. In that varchar fields I've stored also
> names with accents such è, à, ì etc... They work fine using the psql
> program, and also linking tables to access through the odbc driver. But
> when I try to use jdbc to connect to database my accents fail to load.
> For example I have the string 'Forlì Sud'. When I try to
> system.out.println this string catched by jdbc with rs.getString, I see
> this string instead of the original one: 'Forl?ud'.
> I've tried also to use different character sets in the connection url
> like ISO-8859-1, UNICODE, WIN, SQL_ASCII but didn't change anything.
>
Try to create the database with the LATIN9 encoding:

'createdb -E LATIN9 db-name'

Then in Java set the default locale as:

new Locale( "it", "IT", "EURO" );

(or whatever country you want -- don't forget to set the default
time zone too, or you'll get errors from the JDBC driver)

It works with me :)


--
Carlos Correia
MEMÓRIA PERSISTENTE, Lda.
e-mail: carlos@m16e.com
URL: http://www.m16e.com

Re: JDBC driver, PGSQL 7.3.2 and accents characters

From

Csaba Nagy

Date:

20 March 2003, 04:20:53

Your procedure makes absolutely no sense, as Strings are always stored
as Unicode in Java. So what you propose is basically this:
- you have a Unicode-encoded string in the first place;
- encode that string to the "text" byte array using "ISO-8859-1";
- read back the "ISO-8859-1"-encoded byte array to a Unicode String
interpreting the bytes using "UTF-8" encoding... which will more than
likely give you errors, because it is NOT "UTF-8".

HTH
Csaba.
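
A minimal sketch of the point above, assuming the sample string from the original post. The class and method names are hypothetical, and java.nio.charset.StandardCharsets (Java 7 and later) is used for brevity. It encodes a String as ISO-8859-1 and then decodes the same bytes once with the matching charset and once as UTF-8; only the matching decode returns the original text.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
final class EncodingMismatchDemo {

    // Encode as ISO-8859-1, then decode the same bytes with the same charset.
    static String matchedRoundTrip(String s) {
        byte[] latin1Bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1Bytes, StandardCharsets.ISO_8859_1);
    }

    // Encode as ISO-8859-1, then (wrongly) decode the same bytes as UTF-8.
    // The lone byte 0xEC is not a valid UTF-8 sequence, so the decoder
    // substitutes a replacement character instead of the accented letter.
    static String mismatchedRoundTrip(String s) {
        byte[] latin1Bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Forl\u00EC Sud"; // 'Forlì Sud', written with an escape
        System.out.println(original.equals(matchedRoundTrip(original)));
        System.out.println(original.equals(mismatchedRoundTrip(original)));
    }
}
```

The string is written with a Unicode escape so that the result does not depend on the encoding of the source file itself.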


On Thu, 2003-03-20 at 00:11, Daniel Bruce Lynes wrote:
> On Wednesday 19 March 2003 01:35, Davide Romanini wrote:
>
> > I've nice problems with the jdbc driver. I've tried with the jdbc2,
> > jdbc, latest stable and also development release.
> > I've a database in postgres with some varchar fields. The database is
> > SQL_ASCII as char encoding. In that varchar fields I've stored also
> > names with accents such è, à, ì etc... They work fine using the psql
> > program, and also linking tables to access through the odbc driver. But
> > when I try to use jdbc to connect to database my accents fail to load.
> > For example I have the string 'Forlì Sud'. When I try to
> > system.out.println this string catched by jdbc with rs.getString, I see
> > this string instead of the original one: 'Forl?ud'.
> > I've tried also to use different character sets in the connection url
> > like ISO-8859-1, UNICODE, WIN, SQL_ASCII but didn't change anything.
> >
> > Please help me, because this bug makes java and jdbc pretty unusable to
> > connect pgsql databases.
>
> I doubt very much it's a bug in pgsql.  It's probably more than likely a
> misunderstanding on your part about how character sets work in Java.
>
> I'm guessing Barry Lind didn't read the last part of your message, or he
> probably would've known what the problem was, as well.
>
> He is correct however, in stating that PostgreSQL probably will not allow you
> to save accented characters in a database with an encoding of SQL_ASCII.
> You'll need to use SQL_UNICODE(?) as the encoding, more than likely.
>
> Because your character set is iso-8859-1 however, you'll need to convert the
> strings to Unicode first, before saving to the database.
>
> You do this as follows:
>
>     byte[] text=myString.getBytes("iso-8859-1") ;
>     String myNewString=new String(text,"utf-8") ;
>     stmt.setString(x,myNewString) ;
>
> To get it back out, try the following:
>
>     String myString=rs.getString(x) ;
>     byte[] text=myString.getBytes("utf-8") ;
>     String myNewString=new String(text,"iso-8859-1") ;
>
> If you want your code to be portable, I should insist on you specifying the
> character set every time for getting bytes and creating strings.  The reason
> being is that different operating environments will have different default
> character sets.  For instance, in our office, I've got three default
> character sets.  On one Linux machine, it's ISO-8859-1, on another, it's
> GB2312-80, and on the Windows machines it's CP859(?).  The codepage in
> question on Windows is Microsoftese for ISO-8859-1/Latin 1/US ASCII with
> Latin A, depending on which standard you're used to.  It's also often
> referred to as CP437 (DOS and OS/2).

Re: JDBC driver, PGSQL 7.3.2 and accents characters

From

Davide Romanini

Date:

21 March 2003, 18:10:18

Barry Lind wrote:
> Davide,
>
> Those characters are not part of the SQL_ASCII character set.  SQL_ASCII
> is 7bit ascii, the characters you are trying to use are all 8bit
> characters.  You need to create your database with a character set that
> supports the characters you are trying to store.  LATIN1 or UNICODE
> would be good choices.
>
> thanks,
> --Barry

You are surely right, but... my data is already stored in the database.
And when I work with it in psql or through the ODBC driver (linking tables
in M$ Access), my accents are there without any problem. Why doesn't the
JDBC driver work when all the other programs do?

However thanks for your suggestion.

Bye, Romaz

Re: JDBC driver, PGSQL 7.3.2 and accents characters

From

Barry Lind

Date:

21 March 2003, 18:18:55

Davide Romanini wrote:
> You surely are right, but... my data is already stored in the database.
> And when I work with them with psql or the odbc driver (linking tables
> in M$ Access) my accents are there without any problem. Why the jdbc
> driver doesn't work while the others program all work?

Java uses UCS2 as its internal character set.  So jdbc must do a
character set translation for all string data.  In psql (and probably
odbc), by default no translation is needed.

--Barry
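
A rough sketch of why the translation step matters, not the driver's actual code: the decode helper below is a hypothetical stand-in for whatever the driver does when it turns bytes from the server into a Java String. It assumes the stored bytes are single-byte ISO-8859-1 data, which the thread suggests but does not confirm. Decoding the same bytes under two different charset assumptions gives two different strings.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; this is not how any particular driver is implemented.
final class DriverDecodingSketch {

    // Stand-in for the step a driver must perform: bytes in, Java String out.
    static String decode(byte[] wireBytes, Charset assumed) {
        return new String(wireBytes, assumed);
    }

    public static void main(String[] args) {
        // 'Forlì Sud' as single-byte data: in ISO-8859-1, ì is the byte 0xEC.
        byte[] stored = {'F', 'o', 'r', 'l', (byte) 0xEC, ' ', 'S', 'u', 'd'};

        // Decoded with the charset the bytes were written in: accent intact.
        System.out.println(decode(stored, StandardCharsets.ISO_8859_1));

        // Decoded as 7-bit ASCII: 0xEC is outside the range and gets replaced.
        System.out.println(decode(stored, StandardCharsets.US_ASCII));
    }
}
```

A client that passes the bytes through untouched never has to make this choice, which is consistent with psql showing the accents correctly.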

Re: JDBC driver, PGSQL 7.3.2 and accents characters

From

"Mario Rodriguez Villanea"

Date:

21 March 2003, 20:10:18

Hi, I've got the same problem, but my database is already in use, and
I'm wondering if there is a way to change the encoding with an SQL command
like ALTER DATABASE ENCODING 'LATIN1' or something like that.

thanks

-----Original Message-----
From: pgsql-jdbc-owner@postgresql.org
[mailto:pgsql-jdbc-owner@postgresql.org]On Behalf Of Barry Lind
Sent: Wednesday, March 19, 2003 10:37 AM
To: Davide Romanini
Cc: pgsql-jdbc@postgresql.org
Subject: Re: [JDBC] JDBC driver, PGSQL 7.3.2 and accents characters


Davide,

Those characters are not part of the SQL_ASCII character set.  SQL_ASCII
is 7bit ascii, the characters you are trying to use are all 8bit
characters.  You need to create your database with a character set that
supports the characters you are trying to store.  LATIN1 or UNICODE
would be good choices.

thanks,
--Barry

Davide Romanini wrote:
> Hi,
>
> I've nice problems with the jdbc driver. I've tried with the jdbc2,
> jdbc, latest stable and also development release.
> I've a database in postgres with some varchar fields. The database is
> SQL_ASCII as char encoding. In that varchar fields I've stored also
> names with accents such è, à, ì etc... They work fine using the psql
> program, and also linking tables to access through the odbc driver. But
> when I try to use jdbc to connect to database my accents fail to load.
> For example I have the string 'Forlì Sud'. When I try to
> system.out.println this string catched by jdbc with rs.getString, I see
> this string instead of the original one: 'Forl?ud'.
> I've tried also to use different character sets in the connection url
> like ISO-8859-1, UNICODE, WIN, SQL_ASCII but didn't change anything.
>
> Please help me, because this bug makes java and jdbc pretty unusable to
> connect pgsql databases.
>
> Bye, Romaz

Re: JDBC driver, PGSQL 7.3.2 and accents characters

From

Tony Grant

Date:

22 March 2003, 02:56:00

On Sat, 2003-03-22 at 00:11, Davide Romanini wrote:

>
> You surely are right, but... my data is already stored in the database.
> And when I work with them with psql or the odbc driver (linking tables
> in M$ Access) my accents are there without any problem. Why the jdbc
> driver doesn't work while the others program all work?

Because they aren't escaped automatically.

Dreamweaver JSP does it right. I have added a function to make searching
with accents transparent too. One thing I am still having problems with
is inserting ' into the database. My client is escaping manually as in
\'

Cheers

Tony Grant
--
www.tgds.net Library management software toolkit,
redhat linux on Sony Vaio C1XD,
Dreamweaver MX with Tomcat and PostgreSQL

Re: JDBC driver, PGSQL 7.3.2 and accents characters

From

Marco Trevisan

Date:

22 March 2003, 06:13:32

Hello,

I've been testing the 7.3 version of the DBMS and JDBC driver for more
than a month; the application is a webapp under Tomcat.
Basically we had problems migrating from 7.2 databases since we used
MULE_INTERNAL, which no longer seems to be supported by the 7.3 JDBC driver.
After some work trying to solve the problem, the best solution I found
was to change our practice: we now dynamically force the HTML encoding
to be the same as the DBMS's.
So, if I 'createdb -E UNICODE', my HTML tags will specify
"charset=UTF-8" and all browsers switch to Unicode automatically. This
is easy to do on the server side, e.g. by storing the encoding in a
properties file, and should work well: hypothetical users in different
countries should see exactly the same strings and always insert strings
using the same encoding.
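
One way the properties-file approach might look, as a sketch only: the property key db.encoding, the class name and the two name mappings are assumptions made for illustration, not details from this message. The idea is to read the database encoding name from configuration and derive the charset the pages should declare.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical sketch; the key "db.encoding" and all names here are assumptions.
final class PageCharsetSketch {

    // Map a PostgreSQL encoding name to the charset name used in HTML/HTTP.
    // Only two illustrative mappings are included.
    static String htmlCharsetFor(String dbEncoding) {
        switch (dbEncoding) {
            case "UNICODE": return "UTF-8";
            case "LATIN1":  return "ISO-8859-1";
            default:
                throw new IllegalArgumentException("No mapping for " + dbEncoding);
        }
    }

    // Build the Content-Type value the pages should declare.
    static String contentTypeFrom(Properties props) {
        String dbEncoding = props.getProperty("db.encoding", "UNICODE");
        return "text/html; charset=" + htmlCharsetFor(dbEncoding);
    }

    public static void main(String[] args) throws IOException {
        // In a real webapp this would be loaded from a properties file.
        Properties props = new Properties();
        props.load(new StringReader("db.encoding=UNICODE\n"));
        System.out.println(contentTypeFrom(props));
    }
}
```

Keeping the mapping in one place means the database encoding and the page charset cannot drift apart silently; an unmapped encoding fails loudly instead.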

Porting databases was more problematic: I was able to port existing
MULE_INTERNAL databases to new UNICODE ones thanks to our custom db
porting tool, passing through another DBMS, since pg_dump and pg_restore
between two differently encoded databases did not work.
[OT for jdbc] Is there any way to do such porting using
postgreSQL-related tools?

Bye
  Marco

Barry Lind wrote:

> Java uses UCS2 as its internal character set.  So jdbc must do a
> character set translation for all string data.  In psql (and probably
> odbc), by default no translation is needed.
>

Re: JDBC driver, PGSQL 7.3.2 and accents characters

From

Andres Davila

Date:

24 March 2003, 20:07:47

I had the same problem a year ago, when I was looking for a way to
insert accented Spanish words.

I also believed the problem was in my Java source
code, but the problem is with the JDBC connector.

If you are using Dreamweaver, you need to set up the
connector as the following link shows.

Please check it; I hope it is useful to you.  I
kept it.
Eduardo Spremolla <edspremolla@antel.com.uy> sent me
this link; the connector is at the bottom of the page.

http://jdbc.postgresql.org/doc.html

Good luck

Adavila
--- Csaba Nagy <nagy@ecircle-ag.com> wrote:
> Your procedure makes absolutely no sense, as Strings are always stored
> as Unicode in Java. So what you propose is basically this:
> - you have a Unicode-encoded string in the first place;
> - encode that string to the "text" byte array using "ISO-8859-1";
> - read back the "ISO-8859-1"-encoded byte array to a Unicode String
> interpreting the bytes using "UTF-8" encoding... which will more than
> likely give you errors, because it is NOT "UTF-8".
>
> HTH
> Csaba.