Encoding weirdness with JDBC, driver crashing? - Mailing list pgsql-jdbc

From Nikola Milutinovic
Subject Encoding weirdness with JDBC, driver crashing?
Date
Msg-id 3BE4098F.7030303@ev.co.yu
List pgsql-jdbc
Hi all.

I'm having a weird problem with a JDBC connection and the charSet encoding.

OS: Digital UNIX 4.0D/F
DB: PostgreSQL 7.1.2 and 7.1.3

I created a database with the "-E LATIN2" option. Then I imported
WIN1250-encoded data into it. The data was generated from a set of static HTML
pages and was loaded with the client encoding set to WIN1250.

The data looks OK from "psql", and changing the client encoding yields the
expected result. I'm pretty sure it is as it should be.

The JDBC interface, however, behaves very strangely:

URL: jdbc:postgresql://localhost/mercury
OUT: all our alphabet-specific characters are turned into "?"

URL: jdbc:postgresql://localhost/mercury?charSet=LATIN1
OUT: I get data OK - LATIN2 encoded!!!

URL: jdbc:postgresql://localhost/mercury?charSet=LATIN2
OUT: all our alphabet-specific characters are turned into "?"

URL: jdbc:postgresql://localhost/mercury?charSet=UNICODE
OUT: JDBC connection crashes with:

Exception in thread "main" java.sql.SQLException:
  at org.postgresql.Connection.ExecSQL(Connection.java, Compiled Code)
  at org.postgresql.jdbc2.Statement.execute(Statement.java, Compiled Code)
  at org.postgresql.jdbc2.Statement.executeQuery(Statement.java, Compiled Code)
  at test2PostgreSQL.main(test2PostgreSQL.java, Compiled Code)
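
One possible source of the "?" output reported for the first and third URLs,
offered as a guess rather than a diagnosis: when Java encodes a String into a
charset that cannot represent a character, it substitutes "?". The class and
method names below are made up for illustration and are not part of the driver.

```java
import java.nio.charset.StandardCharsets;

public class QuestionMarkDemo {
    // Encode the string as ISO-8859-1 and return the first byte (unsigned).
    // String.getBytes(Charset) replaces unmappable characters with the
    // charset's replacement byte, which for ISO-8859-1 is '?' (0x3F).
    static int firstByteAsLatin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1)[0] & 0xFF;
    }

    public static void main(String[] args) {
        // U+010D (c with caron) has no ISO-8859-1 mapping.
        System.out.println(Integer.toHexString(firstByteAsLatin1("\u010D"))); // 3f
        // U+00E8 (e with grave) does, and survives unchanged.
        System.out.println(Integer.toHexString(firstByteAsLatin1("\u00E8"))); // e8
    }
}
```

Whether the substitution happens in the driver, in the test program's output
stream, or in the terminal cannot be told from the symptoms above alone.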

On the server side, PostgreSQL spits out:

ERROR:  parser: parse error at or near "t?"
FATAL 1:  Socket command type S unknown

(on my terminal, that "t?" looks really strange, two chars I cannot even
describe, I guess Copy/Paste changed it to "t?")

So, does anyone have an idea what is going on? I can live with "charSet=LATIN1"
for the moment, but I have a nasty feeling that the data is not loaded as it
should be. Namely, I'm not sure that, for instance, a "c-acsan" letter that is
Latin-2 encoded in PostgreSQL is really transformed into the Unicode-encoded
"c-acsan" inside my Java application.
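
A minimal sketch of how that doubt could be checked, independent of the driver:
decode the same byte under both charsets and print the resulting Unicode code
points. The byte 0xE8 (c with caron in ISO-8859-2) is chosen only as an example,
and the class and method names are hypothetical.

```java
import java.nio.charset.Charset;

public class CharsetCheck {
    // Decode raw bytes with the named charset and list the resulting
    // UTF-16 code units as "U+XXXX", separated by spaces.
    static String codePoints(byte[] raw, String charsetName) {
        String s = new String(raw, Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] raw = { (byte) 0xE8 };
        // Decoded as Latin-2: the intended letter, c with caron.
        System.out.println(codePoints(raw, "ISO-8859-2")); // U+010D
        // Decoded as Latin-1: a different character, e with grave.
        System.out.println(codePoints(raw, "ISO-8859-1")); // U+00E8
    }
}
```

If the String inside the application shows U+00E8 where U+010D is expected, the
bytes were decoded as Latin-1. Written back out as Latin-1, that character
becomes byte 0xE8 again, which a Latin-2 terminal or page would display
correctly. That would be consistent with the "data OK" result for
charSet=LATIN1, though it is only one possible explanation.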

Since I'm more oriented to JSP for this matter, I'll live with it, but I have an
uneasy feeling about it. I think this issue should be addressed.

PostgreSQL was built with:

--enable-locale              enable locale support
--enable-recode              enable character set recode support
--enable-multibyte           enable multibyte character support
--enable-unicode-conversion  enable unicode conversion support

TYIA,
Nix.

