Hi all.
I'm seeing some weird behaviour with a JDBC connection and the charSet encoding parameter.
OS: Digital UNIX 4.0D/F
DB: PostgreSQL 7.1.2 and 7.1.3
I have created a database with the "-E LATIN2" option. Then I imported
WIN1250-encoded data into it. The data was generated from a set of static HTML
pages and was loaded with the client encoding set to WIN1250.
The data looks OK from "psql", and changing the client encoding yields the
expected result. I'm pretty sure it is stored as it should be.
The JDBC interface, however, behaves in a very weird manner:
URL: jdbc:postgresql://localhost/mercury
OUT: all our alphabet-specific characters are turned into "?"
URL: jdbc:postgresql://localhost/mercury?charSet=LATIN1
OUT: I get data OK - LATIN2 encoded!!!
URL: jdbc:postgresql://localhost/mercury?charSet=LATIN2
OUT: all our alphabet-specific characters are turned into "?"
URL: jdbc:postgresql://localhost/mercury?charSet=UNICODE
OUT: JDBC connection crashes with:
Exception in thread "main" java.sql.SQLException:
    at org.postgresql.Connection.ExecSQL(Connection.java, Compiled Code)
    at org.postgresql.jdbc2.Statement.execute(Statement.java, Compiled Code)
    at org.postgresql.jdbc2.Statement.executeQuery(Statement.java, Compiled Code)
    at test2PostgreSQL.main(test2PostgreSQL.java, Compiled Code)
On the server side, PostgreSQL spits out:
ERROR: parser: parse error at or near "t?"
FATAL 1: Socket command type S unknown
(On my terminal, that "t?" looks really strange: it shows as two characters I
cannot even describe. I guess copy/paste changed it to "t?".)
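To compare the four URL variants above by what actually arrives in Java, rather than by how it looks on a terminal, a small probe can print the Unicode code points of one fetched value per setting. This is only a rough sketch: the user name, password, and the "pages" table with its "title" column are hypothetical placeholders, and it is written for a current JDK rather than the one in use here.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class CharSetProbe {
    // Append the charSet parameter to the base URL, or leave the URL bare.
    static String url(String base, String charSet) {
        return charSet == null ? base : base + "?charSet=" + charSet;
    }

    // Render a string as a space-separated list of code points, e.g. "U+010D".
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String base = "jdbc:postgresql://localhost/mercury";
        String[] settings = { null, "LATIN1", "LATIN2", "UNICODE" };
        for (String cs : settings) {
            // "user", "password", "pages" and "title" are placeholders (assumptions).
            try (Connection c = DriverManager.getConnection(url(base, cs), "user", "password");
                 Statement st = c.createStatement();
                 ResultSet rs = st.executeQuery("SELECT title FROM pages LIMIT 1")) {
                if (rs.next()) {
                    System.out.println(cs + ": " + hex(rs.getString(1)));
                }
            } catch (SQLException e) {
                // Report the failure and carry on with the next setting.
                System.out.println(cs + ": failed: " + e.getMessage());
            }
        }
    }
}
```

A "?" shows up as U+003F, so it becomes obvious whether the character was lost inside the driver or only on output.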
So, does anyone have an idea what is going on? I can live with "charSet=LATIN1"
for the moment, but I have a nasty feeling that the data is not being loaded as
it should be. Namely, I'm not sure that, for instance, the "c-acsan" letter,
Latin-2 encoded in PostgreSQL, is really transformed into the Unicode "c-acsan"
inside my Java application.
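The suspicion can be illustrated without a database. The sketch below assumes byte 0xE8 as the example letter (c with caron in ISO-8859-2; the letter actually meant above may be a different one) and assumes the JDK ships the ISO-8859-2 charset, which is usual but not guaranteed. It decodes the same byte both ways and shows that a Latin-1 decode round-trips to the original byte, which would explain output that looks correct on a Latin-2 terminal even though the Java string holds a different character.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Latin2Check {
    // Decode raw bytes using the named charset.
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    // Render a string as a space-separated list of code points.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Assumed example: 0xE8 is "c with caron" in ISO-8859-2.
        byte[] raw = { (byte) 0xE8 };

        String asLatin2 = decode(raw, "ISO-8859-2");
        String asLatin1 = decode(raw, "ISO-8859-1");

        System.out.println("decoded as Latin-2: " + codePoints(asLatin2)); // U+010D
        System.out.println("decoded as Latin-1: " + codePoints(asLatin1)); // U+00E8

        // Re-encoding the Latin-1 string as Latin-1 gives back the original byte,
        // so a Latin-2 terminal displays it correctly despite the wrong decode.
        byte[] back = asLatin1.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("round trip preserves byte: " + (back[0] == raw[0])); // true
    }
}
```

If the strings fetched with "charSet=LATIN1" show code points such as U+00E8 where U+010D is expected, the data only looks right on output, and sorting, case conversion, or any output in another encoding inside the application would go wrong.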
Since I'm more oriented to JSP for this matter, I'll live with it, but I have an
uneasy feeling about it. I think this issue should be addressed.
PostgreSQL was built with:
  --enable-locale              enable locale support
  --enable-recode              enable character set recode support
  --enable-multibyte           enable multibyte character support
  --enable-unicode-conversion  enable unicode conversion support
TYIA,
Nix.