Thread: Charset problem on WHERE clause
Hi,

I'm pretty new to PostgreSQL as well as to its JDBC driver.

For now I'm using PostgreSQL version 7.3.6 under Red Hat ES 3.0. The database was created with SQL_ASCII encoding. I'm retrieving data from the database with the pg74.214.jdbc3.jar driver.

Some fields contain values with accents (characters like Ç, Ã, Õ, etc.) ...

I've set the connection string to
jdbc:postgresql://10.100.1.11:5432/mydatabase?charSet=LATIN1

In the Java code I must read the fields with
new String(result.getBytes(1),"ISO-8859-1")
to have accented characters displayed correctly.

My problem is that when I use an accented character in a WHERE expression, the query doesn't return any rows. I've tried
field IN (to_char('MANUTENÇÃO', 'LATIN1'))
but with no success.

Any idea or help on this?

By the way, both pgAdminIII and the psql command-line tool work with accents in the WHERE clause.

Thanks
On Mon, 26 Jul 2004, smota wrote:

> Hi,
>
> I'm pretty new to PostgreSQL as well as to its JDBC driver.
>
> For now I'm using PostgreSQL 7.3.6 version under Red Hat ES 3.0.
> The database is created with SQL_ASCII encoding.
> I'm retrieving data from the database with the pg74.214.jdbc3.jar driver.
>
> jdbc:postgresql://10.100.1.11:5432/mydatabase?charSet=LATIN1

You should not use a SQL_ASCII database. The JDBC driver requires your database to use a proper encoding for your data. The ?charSet URL parameter was designed to work around this problem for <= 7.2 servers, which didn't come with multibyte encoding support compiled in by default, but it is ignored for >= 7.3 servers, so it is useless here.

Kris Jurka
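To make the mismatch concrete, here is a minimal Java sketch of why a WHERE comparison on accented text can come back empty. It rests on an assumption that is not confirmed in this thread: that the query literal reaches the server as UTF-8 bytes while the SQL_ASCII database holds the value as unconverted Latin-1 bytes. The class and method names are made up for illustration and are not part of the driver.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingMismatch {
    // "MANUTENÇÃO" written with Unicode escapes so the source file encoding does not matter.
    static final String VALUE = "MANUTEN\u00C7\u00C3O";

    // Bytes as they would be stored if the data was loaded as Latin-1 (assumption).
    static byte[] latin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Bytes as they would be sent if the query text is encoded as UTF-8 (assumption).
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] stored = latin1(VALUE);
        byte[] sent = utf8(VALUE);
        // One byte per character in Latin-1; the two accented letters take two bytes each in UTF-8.
        System.out.println("stored bytes: " + stored.length); // 10
        System.out.println("sent bytes:   " + sent.length);   // 12
        // A byte-wise comparison, which is all a server can do without knowing the encoding, fails.
        System.out.println("equal: " + Arrays.equals(stored, sent)); // false
    }
}
```

With a database created in a real encoding such as LATIN1, the server knows how to convert between the client encoding and the stored bytes, so this kind of byte-level mismatch should not arise.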
Kris Jurka wrote:

> You should not use a SQL_ASCII database. The JDBC driver requires your
> database to use a proper encoding for your data. The ?charSet url
> parameter was designed to work around this problem for <= 7.2 servers
> which didn't come with multibyte encoding support compiled by default, but
> it is ignored in >= 7.3 servers so it is useless here.

I wonder if it's worth supporting the charSet parameter even for >= 7.3: set client_encoding explicitly to SQL_ASCII (which I believe means "no translation") and do the translation to Unicode on the JVM side using whatever charset the user provided. I think most of the encoding details are now isolated from the rest of the protocol logic, so it wouldn't be a very invasive change.

My only concern is that it'd encourage people to keep their DBs as SQL_ASCII, which just delays the problem.

-O
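As a rough illustration of the JVM-side translation idea, the sketch below decodes and encodes raw bytes with a user-supplied charset name. It is not the driver's code: `ClientSideCharset`, `decode`, and `encode` are hypothetical names, and the assumption is simply that the server passes bytes through untranslated and the client does all conversion.

```java
import java.nio.charset.Charset;

public class ClientSideCharset {
    // Hypothetical helper: turn untranslated bytes from the server into a Java (Unicode) string.
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    // Hypothetical helper: turn a Java string into bytes in the user's charset before sending.
    static byte[] encode(String value, String charsetName) {
        return value.getBytes(Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // 0xC7 and 0xC3 are 'Ç' and 'Ã' in ISO-8859-1 (Latin-1).
        byte[] fromServer = {(byte) 0xC7, (byte) 0xC3};
        String text = decode(fromServer, "ISO-8859-1");
        System.out.println(text.equals("\u00C7\u00C3"));

        // Encoding with the same charset reproduces the original bytes.
        byte[] toServer = encode(text, "ISO-8859-1");
        System.out.println(toServer.length);
    }
}
```

The catch is the one already raised: this only works if the user names the charset the data was actually stored in, and nothing on the server side can verify that.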
Hi,

Thanks, Kris, for your fast response. I've recreated the database using LATIN1 and everything is working just fine.

> I wonder if it's worth supporting the charSet parameter even for >= 7.3:
> set client_encoding explicitly to SQL_ASCII (which I believe means "no
> translation") and do the translation to Unicode on the JVM side using
> whatever charset the user provided. I think most of the encoding details
> are now isolated from the rest of the protocol logic, so it wouldn't be
> a very invasive change.

I really don't see this as a good idea :O

With the LATIN1 charset I no longer need the Java conversion (with new String(rs.getBytes(1)....etc. etc.), so my code is cleaner and easier. It was a good thing that you made me use a correctly encoded database. :)

Thanks