Thread: JDBC driver, PGSQL 7.3.2 and accents characters
Hi,

I've got some nice problems with the JDBC driver. I've tried the jdbc2, jdbc, latest stable and development releases. I have a database in Postgres with some varchar fields. The database uses SQL_ASCII as its character encoding. In those varchar fields I've also stored names with accents such as è, à, ì etc. They work fine using the psql program, and also when linking tables to Access through the ODBC driver. But when I use JDBC to connect to the database, my accents fail to load.

For example, I have the string 'Forlì Sud'. When I System.out.println this string as fetched by JDBC with rs.getString, I see this string instead of the original one: 'Forl?ud'.

I've also tried different character sets in the connection URL, such as ISO-8859-1, UNICODE, WIN and SQL_ASCII, but nothing changed.

Please help me, because this bug makes Java and JDBC pretty unusable for connecting to pgsql databases.

Bye, Romaz
Davide,

Those characters are not part of the SQL_ASCII character set. SQL_ASCII is 7bit ascii, the characters you are trying to use are all 8bit characters. You need to create your database with a character set that supports the characters you are trying to store. LATIN1 or UNICODE would be good choices.

thanks,
--Barry
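To illustrate the 7-bit limit described above, here is a minimal Java sketch. It is not from the thread, and the class and method names are hypothetical. It only shows how the JDK's US-ASCII encoder treats an accented character; it does not model how PostgreSQL stores bytes in an SQL_ASCII database.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the JDBC driver or the thread.
final class SevenBitDemo {

    // True when every char of s fits in 7 bits (value below 0x80).
    static boolean fitsInSevenBits(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) >= 0x80) {
                return false;
            }
        }
        return true;
    }

    // Encodes s as US-ASCII and decodes it again. String.getBytes(Charset)
    // substitutes '?' for every character the charset cannot represent.
    static String throughAscii(String s) {
        byte[] ascii = s.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // \u00EC is i-grave, written as an escape so the source stays ASCII.
        String name = "Forl\u00EC Sud";
        System.out.println(fitsInSevenBits(name)); // false
        System.out.println(throughAscii(name));    // Forl? Sud
    }
}
```

Note that this yields 'Forl? Sud', not the 'Forl?ud' reported in the original post, so a plain 7-bit substitution on its own does not account for the two missing characters.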
Mario,

I do not know of any easy way. I believe the character set for a database can only be set when the database is created. So you would need to dump the data, drop and recreate the database with the correct character set and then reload the data.

thanks,
--Barry

Mario Rodriguez Villanea wrote:
> Hi I've got the same problem, but my database is currently working, and
> I'm wondering if there is way to change with an SQL command
> like ALTER DATABASE ENCODING 'LATIN1' or something like that
>
> thanks
On Wednesday 19 March 2003 01:35, Davide Romanini wrote:
> Please help me, because this bug makes java and jdbc pretty unusable to
> connect pgsql databases.

I doubt very much it's a bug in pgsql. It's more than likely a misunderstanding on your part about how character sets work in Java.

I'm guessing Barry Lind didn't read the last part of your message, or he probably would've known what the problem was, as well.

He is correct however, in stating that PostgreSQL probably will not allow you to save accented characters in a database with an encoding of SQL_ASCII. You'll need to use SQL_UNICODE(?) as the encoding, more than likely.

Because your character set is iso-8859-1 however, you'll need to convert the strings to Unicode first, before saving to the database.

You do this as follows:

    byte[] text=myString.getBytes("iso-8859-1") ;
    String myNewString=new String(text,"utf-8") ;
    stmt.setString(x,myNewString) ;

To get it back out, try the following:

    String myString=rs.getString(x) ;
    byte[] text=myString.getBytes("utf-8") ;
    String myNewString=new String(text,"iso-8859-1") ;

If you want your code to be portable, I would insist on specifying the character set every time you get bytes or create strings. The reason is that different operating environments have different default character sets. For instance, in our office, I've got three default character sets. On one Linux machine, it's ISO-8859-1, on another, it's GB2312-80, and on the Windows machines it's CP859(?). The codepage in question on Windows is Microsoftese for ISO-8859-1/Latin 1/US ASCII with Latin A, depending on which standard you're used to. It's also often referred to as CP437 (DOS and OS/2).
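The portability advice above (always name the charset instead of relying on the platform default) can be sketched as follows. This is only an illustration with hypothetical class and method names; it uses the `StandardCharsets` constants from the JDK rather than charset name strings, which is a substitution on my part and not something the thread uses.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: explicit charsets instead of the platform default.
final class ExplicitCharsetDemo {

    // Number of bytes s occupies when encoded as ISO-8859-1.
    static int latin1Length(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    // Number of bytes s occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Encodes and decodes with the same, explicitly named charset.
    static String roundTrip(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        // The default differs between machines, which is why it is avoided below.
        System.out.println("Platform default: " + Charset.defaultCharset());

        String name = "Forl\u00EC Sud"; // \u00EC is i-grave
        System.out.println(latin1Length(name)); // 9
        System.out.println(utf8Length(name));   // 10
        System.out.println(roundTrip(name, StandardCharsets.UTF_8).equals(name)); // true
    }
}
```

The same string has a different byte length in each encoding, so code that calls `getBytes()` or `new String(bytes)` without a charset can behave differently from one machine to the next.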
Davide Romanini wrote:
> I've tried also to use different character sets in the connection url
> like ISO-8859-1, UNICODE, WIN, SQL_ASCII but didn't change anything.

Try to create the database with the LATIN9 encoding:

    createdb -E LATIN9 db-name

Then in Java set the default locale as:

    new Locale( "it", "IT", "EURO" );

(or whatever country you want -- don't forget to set the default Timezone too, or you'll get errors from the JDBC driver)

It works for me :)

--
Carlos Correia
MEMÓRIA PERSISTENTE, Lda.
e-mail: carlos@m16e.com
URL: http://www.m16e.com
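A minimal sketch of setting the JVM-wide defaults mentioned above. Assumptions: it uses the `Locale.ITALY` constant instead of the three-argument constructor with the "EURO" variant, and "Europe/Rome" as the time zone; both are my choices for illustration. Whether these defaults have any effect on the driver's character conversion is not established in the thread; the code only shows the JDK calls involved.

```java
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class: sets JVM-wide default locale and time zone.
final class DefaultsDemo {

    // Applies Italian defaults for the whole JVM. Locale.ITALY stands in
    // for the locale from the post; the time zone ID is an assumption.
    static void applyItalianDefaults() {
        Locale.setDefault(Locale.ITALY);
        TimeZone.setDefault(TimeZone.getTimeZone("Europe/Rome"));
    }

    // Returns the current defaults as "<locale> <time zone id>".
    static String describeDefaults() {
        return Locale.getDefault() + " " + TimeZone.getDefault().getID();
    }

    public static void main(String[] args) {
        applyItalianDefaults();
        System.out.println(describeDefaults());
    }
}
```

Both calls change global state, so in a shared container such as Tomcat they affect every web application running in the same JVM.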
Your procedure makes absolutely no sense, as Strings are always stored as Unicode in Java. So what you propose is basically this:
- you have a Unicode-encoded string in the first place;
- encode that string to the "text" byte array using "ISO-8859-1";
- read back the "ISO-8859-1"-encoded byte array to a Unicode String interpreting the bytes using "UTF-8" encoding... which will more than likely give you errors, because it is NOT "UTF-8".

HTH
Csaba.

On Thu, 2003-03-20 at 00:11, Daniel Bruce Lynes wrote:
> byte[] text=myString.getBytes("iso-8859-1") ;
> String myNewString=new String(text,"utf-8") ;
> stmt.setString(x,myNewString) ;
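The three steps criticised above can be tried out directly. The following is a minimal sketch with hypothetical class and method names; `mismatched` reproduces the encode-as-ISO-8859-1, decode-as-UTF-8 sequence, and `matched` uses the same charset in both directions for comparison.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: encoding and decoding with different charsets.
final class MismatchedRoundTripDemo {

    // Encode as ISO-8859-1, then interpret those bytes as UTF-8.
    static String mismatched(String s) {
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    // Encode and decode with the same charset.
    static String matched(String s) {
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String name = "Forl\u00EC Sud"; // \u00EC is i-grave
        // The lone byte 0xEC is not a valid UTF-8 sequence, so the
        // accented character does not survive the mismatched decode.
        System.out.println(mismatched(name).equals(name)); // false
        System.out.println(matched(name).equals(name));    // true
    }
}
```

Pure ASCII text passes through `mismatched` unchanged, which is why this kind of mistake often goes unnoticed until the first accented character shows up.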
Barry Lind wrote:
> You need to create your database with a character set that
> supports the characters you are trying to store. LATIN1 or UNICODE
> would be good choices.

You are surely right, but... my data is already stored in the database. And when I work with it through psql or the ODBC driver (linking tables in M$ Access), my accents are there without any problem. Why doesn't the JDBC driver work when all the other programs do?

Thanks for your suggestion anyway.

Bye, Romaz
Davide Romanini wrote:
> You surely are right, but... my data is already stored in the database.
> And when I work with them with psql or the odbc driver (linking tables
> in M$ Access) my accents are there without any problem. Why the jdbc
> driver doesn't work while the others program all work?

Java uses UCS2 as its internal character set. So jdbc must do a character set translation for all string data. In psql (and probably odbc), by default no translation is needed.

--Barry
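To make the translation step above concrete, here is a minimal sketch (hypothetical class and method names, not driver code) that prints the 16-bit value Java holds for one accented character next to the bytes it becomes in a few encodings. A client that passes bytes through untouched never has to choose between these forms; Java always does.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: one character, several byte representations.
final class InternalFormDemo {

    // Formats bytes as upper-case hex without separators.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    static String encodedHex(String s, Charset cs) {
        return hex(s.getBytes(cs));
    }

    static String latin1Hex(String s) { return encodedHex(s, StandardCharsets.ISO_8859_1); }
    static String utf8Hex(String s)   { return encodedHex(s, StandardCharsets.UTF_8); }
    static String utf16Hex(String s)  { return encodedHex(s, StandardCharsets.UTF_16BE); }

    public static void main(String[] args) {
        String accent = "\u00EC"; // i-grave
        System.out.println(Integer.toHexString(accent.charAt(0))); // ec
        System.out.println(latin1Hex(accent)); // EC
        System.out.println(utf8Hex(accent));   // C3AC
        System.out.println(utf16Hex(accent));  // 00EC
    }
}
```

The driver has to know which of these byte forms the server is sending in order to rebuild the 16-bit value; with SQL_ASCII the server makes no statement about that, so the driver is left to guess.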
Hi,

I've got the same problem, but my database is currently working, and I'm wondering if there is a way to change the encoding with an SQL command like ALTER DATABASE ENCODING 'LATIN1' or something like that.

thanks

-----Original Message-----
From: pgsql-jdbc-owner@postgresql.org [mailto:pgsql-jdbc-owner@postgresql.org] On Behalf Of Barry Lind
Sent: Wednesday, March 19, 2003 10:37 AM
To: Davide Romanini
Cc: pgsql-jdbc@postgresql.org
Subject: Re: [JDBC] JDBC driver, PGSQL 7.3.2 and accents characters
On Sat, 2003-03-22 at 00:11, Davide Romanini wrote:
> You surely are right, but... my data is already stored in the database.
> And when I work with them with psql or the odbc driver (linking tables
> in M$ Access) my accents are there without any problem. Why the jdbc
> driver doesn't work while the others program all work?

Because they aren't escaped automatically. Dreamweaver JSP does it right. I have added a function to make searching with accents transparent too.

One thing I am still having problems with is inserting ' into the database. My client is escaping manually as in \'

Cheers

Tony Grant

--
www.tgds.net Library management software toolkit,
redhat linux on Sony Vaio C1XD,
Dreamweaver MX with Tomcat and PostgreSQL
Hello,

I've been testing the 7.3 version of the DBMS and JDBC driver for more than one month; the application is a webapp under Tomcat. Basically we had problems migrating from 7.2 databases since we used MULE_INTERNAL, which seems to be no longer supported by the 7.3 JDBC driver.

After some work trying to solve the problem, the best solution I found was to change our practice: we now dynamically force the HTML encoding to be the same as the DBMS's. So, if I 'createdb -E UNICODE', my HTML tags will specify "charset=UTF-8" and all browsers switch to Unicode automatically. This is easy to do on the server side, for example by storing the encoding in a properties file, and should work well: hypothetical users in different countries should see exactly the same strings and always insert strings using the same encoding.

Porting databases was more problematic: I was able to port existing MULE_INTERNAL databases to new UNICODE ones thanks to our custom db porting tool, passing through another DBMS, since pg_dump and pg_restore between two differently encoded databases did not work.

[OT for jdbc] Is there any way to do such porting using PostgreSQL-related tools?

Bye
Marco

Barry Lind wrote:
> Java uses UCS2 as its internal character set. So jdbc must do a
> character set translation for all string data. In psql (and probably
> odbc), by default no translation is needed.
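The properties-file approach described above might look roughly like the following. This is a sketch under stated assumptions, not Marco's code: the property key `db.encoding`, the class name and the method names are all hypothetical, and the mapping covers only the PostgreSQL encoding names mentioned in this thread.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.Properties;

// Hypothetical demo class: derive the HTML charset from the database encoding.
final class HtmlCharsetDemo {

    // Maps a PostgreSQL encoding name to the charset label used in HTML.
    // Only the names that appear in this thread are handled.
    static String htmlCharsetFor(String pgEncoding) {
        switch (pgEncoding.toUpperCase(Locale.ROOT)) {
            case "UNICODE": return "UTF-8";
            case "LATIN1":  return "ISO-8859-1";
            case "LATIN9":  return "ISO-8859-15";
            default:
                throw new IllegalArgumentException("Unmapped encoding: " + pgEncoding);
        }
    }

    // Reads the assumed key "db.encoding" from properties text and builds
    // the Content-Type value a page would declare.
    static String contentTypeFromProperties(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String encoding = props.getProperty("db.encoding", "UNICODE");
        return "text/html; charset=" + htmlCharsetFor(encoding);
    }

    public static void main(String[] args) {
        System.out.println(contentTypeFromProperties("db.encoding=UNICODE\n"));
        // text/html; charset=UTF-8
    }
}
```

In a servlet, the resulting string would typically be passed to `response.setContentType(...)` before any output is written, so that the browser submits form data in the same encoding.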
I had the same problem a year ago, when I was looking for a way to insert accented Spanish words. I also believed the problem was in my Java source code. The problem is with the JDBC connector. If you are using Dreamweaver you need to set up the connector as the following link shows. Please check it; I hope it is useful to you. I kept it.

Eduardo Spremolla <edspremolla@antel.com.uy> sent me this link; the connector is at the bottom of the page.

http://jdbc.postgresql.org/doc.html

Good luck
Adavila

--- Csaba Nagy <nagy@ecircle-ag.com> wrote:
> Your procedure makes absolutely no sense, as Strings are always stored
> as Unicode in Java.