Thread: Charset problem on WHERE clause

Charset problem on WHERE clause

From
smota
Date:
Hi,

I'm pretty new to PostgreSQL as well as to its JDBC driver.

For now I'm using PostgreSQL version 7.3.6 under Red Hat ES 3.0.
The database is created with SQL_ASCII encoding.
I'm retrieving data from the database with the pg74.214.jdbc3.jar driver.

Some fields contain values with accents (characters like Ç, Ã, Õ, etc.) ...

I've set the connection string with
jdbc:postgresql://10.100.1.11:5432/mydatabase?charSet=LATIN1

In the Java code I have to read the fields with new
String(result.getBytes(1),"ISO-8859-1") to have accented characters
displayed correctly .....

My problem is that when I use an accented character in a WHERE expression,
the query doesn't return any rows.
I've tried field IN (to_char('MANUTENÇÃO', 'LATIN1')) ... but with no success.

Any idea or help on this?

By the way ... pgAdminIII and the psql command-line tool both work
with accents in the WHERE clause.

Thanks

Re: Charset problem on WHERE clause

From
Kris Jurka
Date:

On Mon, 26 Jul 2004, smota wrote:

> Hi,
>
> I'm pretty new to PostgreSQL as well as to its JDBC driver.
>
> For now I'm using PostgreSQL version 7.3.6 under Red Hat ES 3.0.
> The database is created with SQL_ASCII encoding.
> I'm retrieving data from the database with the pg74.214.jdbc3.jar driver.
>
> jdbc:postgresql://10.100.1.11:5432/mydatabase?charSet=LATIN1

You should not use a SQL_ASCII database.  The JDBC driver requires your
database to use a proper encoding for your data.  The ?charSet URL
parameter was designed to work around this problem for <= 7.2 servers,
which didn't come with multibyte encoding support compiled by default, but
it is ignored for >= 7.3 servers, so it is useless here.

Kris Jurka
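
A minimal sketch of why the WHERE clause finds nothing, under an assumption the thread does not state outright: the driver sends the query text as UTF-8, while the SQL_ASCII database holds the bytes exactly as a LATIN1 client stored them and compares them without any conversion. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingMismatchDemo {
    // The word from the original question, written with Unicode escapes
    // so that the encoding of this source file does not matter.
    public static final String WORD = "MANUTEN\u00C7\u00C3O";

    // Bytes as they would sit in a SQL_ASCII database loaded with LATIN1 data.
    public static byte[] storedBytes() {
        return WORD.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Bytes a client would send if it encodes the query text as UTF-8
    // (assumption, see the paragraph above).
    public static byte[] sentBytes() {
        return WORD.getBytes(StandardCharsets.UTF_8);
    }

    // With no conversion on the server, the comparison is byte for byte.
    public static boolean whereWouldMatch() {
        return Arrays.equals(storedBytes(), sentBytes());
    }

    public static void main(String[] args) {
        System.out.println("stored: " + storedBytes().length + " bytes");
        System.out.println("sent:   " + sentBytes().length + " bytes");
        System.out.println("match:  " + whereWouldMatch());
    }
}
```

Each accented character is one byte in ISO-8859-1 and two bytes in UTF-8, so the two byte sequences differ and the comparison fails even though the Java strings are identical.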

Re: Charset problem on WHERE clause

From
Oliver Jowett
Date:
Kris Jurka wrote:

> You should not use a SQL_ASCII database.  The JDBC driver requires your
> database to use a proper encoding for your data.  The ?charSet URL
> parameter was designed to work around this problem for <= 7.2 servers,
> which didn't come with multibyte encoding support compiled by default, but
> it is ignored for >= 7.3 servers, so it is useless here.

I wonder if it's worth supporting the charSet parameter even for >= 7.3:
set client_encoding explicitly to SQL_ASCII (which I believe means "no
translation") and do the translation to Unicode on the JVM side using
whatever charset the user provided. I think most of the encoding details
are now isolated from the rest of the protocol logic, so it wouldn't be
a very invasive change.

My only concern is that it'd encourage people to keep their DBs as
SQL_ASCII .. which is just delaying the problem.

-O
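
The JVM-side translation proposed here could look roughly like the sketch below. It is not driver code: ClientSideCharset and its two methods are hypothetical helpers, and it assumes the user-supplied name is one Java recognizes (PostgreSQL encoding names and Java charset names do not always coincide, so a real implementation would likely need a mapping).

```java
import java.nio.charset.Charset;

public class ClientSideCharset {
    // Decode raw, untranslated column bytes with the charset the user named.
    public static String decode(byte[] raw, String charSetName) {
        return new String(raw, Charset.forName(charSetName));
    }

    // Encode a query value with the same charset before it goes to the server,
    // so that it matches the stored bytes.
    public static byte[] encode(String value, String charSetName) {
        return value.getBytes(Charset.forName(charSetName));
    }

    public static void main(String[] args) {
        String word = "MANUTEN\u00C7\u00C3O";
        byte[] wire = encode(word, "ISO-8859-1");
        // Decoding with the same charset restores the original string.
        System.out.println(decode(wire, "ISO-8859-1").equals(word));
    }
}
```

The decode half is the same conversion the original poster was doing by hand with new String(result.getBytes(1),"ISO-8859-1"); the encode half is what was missing on the WHERE side.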

Re: Charset problem on WHERE clause

From
smota
Date:
Hi,

Thanks Kris for your fast response .... I've recreated the database
using LATIN1 and everything is working just fine.

> I wonder if it's worth supporting the charSet parameter even for >= 7.3:
> set client_encoding explicitly to SQL_ASCII (which I believe means "no
> translation") and do the translation to Unicode on the JVM side using
> whatever charset the user provided. I think most of the encoding details
> are now isolated from the rest of the protocol logic, so it wouldn't be
> a very invasive change.

I really don't see this as a good idea :O ....

With the LATIN1 database I no longer need the Java conversion (new
String(rs.getBytes(1), ...) etc.), so my code is cleaner and simpler.
It was a good thing to be made to use a correctly encoded database.

:)

Thanks
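
For comparison, a sketch of what the simpler code might look like once the database has a real encoding such as LATIN1. The table and column names (tasks, description) and the AccentedLookup class are hypothetical; only standard java.sql calls are used, and the Connection is assumed to be opened elsewhere with a plain JDBC URL.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class AccentedLookup {
    // Hypothetical table and column names, for illustration only.
    public static final String QUERY =
        "SELECT description FROM tasks WHERE description = ?";

    // Returns every matching value; no manual byte conversion is involved.
    public static List<String> findByDescription(Connection conn, String description)
            throws SQLException {
        List<String> matches = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(QUERY)) {
            // The accented value is passed as an ordinary Java string.
            ps.setString(1, description);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // getString replaces new String(rs.getBytes(1), "ISO-8859-1").
                    matches.add(rs.getString(1));
                }
            }
        }
        return matches;
    }
}
```

A call such as findByDescription(conn, "MANUTEN\u00C7\u00C3O") would then pass the accented value straight through, leaving the encoding work to the driver and server.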