Re: Problems with charsets, investigated... - Mailing list pgsql-jdbc
From | Alexandre Aufrere |
---|---|
Subject | Re: Problems with charsets, investigated... |
Date | |
Msg-id | 20040806183208.75F47400E5@smtp.ies.inet6.fr |
In response to | Problems with charsets, investigated... (Alexandre Aufrere <alexandre.aufrere@inet6.fr>) |
List | pgsql-jdbc |
Well, no, actually I want to use LATIN1/ISO-8859-1 everywhere. So my appserver should get ISO-8859-1 strings from the driver, not UTF-8. Why? Because we have a whole bunch of apps developed in ISO-8859-1, as well as a lot of data in LATIN1, and it's out of the question to move everything to UTF-8/UNICODE.

For me, the driver should encode strings according to the system properties of the JVM it runs in. Or at least there should be a way to tell the driver which charset to use. In other words, the current behaviour is precisely NOT transparent to me: I end up with a database in LATIN1 whose data is converted to UTF-8 before I retrieve it from the JDBC driver, which 1) would give me more work to convert it back to ISO-8859-1, and 2) would not be backward compatible (meaning we'd have to retest a LOT of apps to check we're not breaking anything).

So my hack just reads the file.encoding Java system property, requests data from the PostgreSQL server in the matching encoding, and handles it accordingly (namely, if file.encoding is ISO-8859-1, it requests LATIN1 and treats everything it gets as ISO-8859-1).

Now, IMHO, ideally, the default behaviour of the JDBC driver should be to read the encoding from the pg_database table and deduce from it which encoding to use for strings. And of course, there should be an easy way to change that for people who want it another way.

I don't know exactly how it worked in previous versions; the fact is that with the LANG environment variables set everywhere to en_US.ISO-8859-1 and the encoding in pg_database set to 8 (LATIN1), it just worked (we have been using postgresql+java+Enhydra for a long, long time). Any change to that which forces us to handle charsets explicitly might be "ideally" right, but it is not backward compatible and will cause us a lot of problems (and I'm quite sure not only us).
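For illustration, the lookup step of the hack described above (reading file.encoding and picking the matching server encoding) could look roughly like the sketch below. This is not the actual patch: the class name `EncodingMapper`, its methods, and the small mapping table are hypothetical, and only three charsets are covered. It only computes the PostgreSQL encoding name; it does not change what the driver negotiates with the server.

```java
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the PostgreSQL JDBC driver.
final class EncodingMapper {
    // Canonical Java charset name -> PostgreSQL encoding name (illustrative subset).
    private static final Map<String, String> JAVA_TO_PG = new HashMap<>();
    static {
        JAVA_TO_PG.put("ISO-8859-1", "LATIN1");
        JAVA_TO_PG.put("ISO-8859-15", "LATIN9");
        JAVA_TO_PG.put("UTF-8", "UTF8");
    }

    private EncodingMapper() {}

    // Resolves aliases such as "latin1" to the canonical Java name first,
    // then looks up the PostgreSQL name. Throws IllegalArgumentException
    // if the charset is unknown to Java or missing from the table.
    static String toPostgresEncoding(String javaCharsetName) {
        String canonical = Charset.forName(javaCharsetName).name();
        String pgName = JAVA_TO_PG.get(canonical);
        if (pgName == null) {
            throw new IllegalArgumentException("No PostgreSQL mapping for " + canonical);
        }
        return pgName;
    }

    // Uses the JVM's file.encoding property, falling back to the default charset.
    static String forJvmDefault() {
        String name = System.getProperty("file.encoding");
        if (name == null) {
            name = Charset.defaultCharset().name();
        }
        return toPostgresEncoding(name);
    }
}
```

With file.encoding set to ISO-8859-1, `EncodingMapper.forJvmDefault()` would return `LATIN1`, which is the name one would pass to something like `SET client_encoding`.
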
Lastly, it's highly possible that I missed something somewhere, so I apologize in advance for being utterly dumb ;-)

Regards,
Alexandre Aufrere

----------------------------------------------------
From: Kris Jurka <books@ejurka.com>
To: Alexandre Aufrere <alexandre.aufrere@inet6.fr>
Subject: Re: [JDBC] Problems with charsets, investigated...
Date: Fri, 6 Aug 2004 11:05:54 -0500 (EST)

> On Fri, 6 Aug 2004, Alexandre Aufrere wrote:
>
> > Java correctly sets its file.encoding property to the charset specified
> > in the LANG environment variable. However, it appears that whatever i
> > set this variable to, the JDBC driver seems to use UTF-8.
>
> I'm not sure what problem or issue you think this is addressing, but it is
> not something we want to do. The driver communicates with the server
> using UTF-8, so you should not be adjusting this and it is entirely
> transparent to the user. What you do after retrieving data is your
> business and you are welcome to save it or display it in any encoding you
> desire, but the driver wants to communicate with the server using UTF-8.
>
> Kris Jurka
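
On the point in the quoted reply that the application can save or display retrieved data in any encoding it likes: a Java String carries no byte encoding of its own, so the conversion to ISO-8859-1 can happen at the point where bytes are written out. A minimal sketch of that step follows, using only the standard java.nio.charset API; the class name `Latin1Output` and the method `toLatin1` are hypothetical. It assumes the caller would rather get an error than a silent '?' when a character has no ISO-8859-1 representation.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for the output boundary of an ISO-8859-1 application.
final class Latin1Output {
    private Latin1Output() {}

    // Encodes a String (for example a value read via ResultSet.getString)
    // as ISO-8859-1 bytes. Characters that do not exist in ISO-8859-1
    // raise CharacterCodingException instead of being replaced.
    static byte[] toLatin1(String value) throws CharacterCodingException {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer buffer = encoder.encode(CharBuffer.wrap(value));
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        return bytes;
    }
}
```

For data that really is all LATIN1 in the database, every character maps to a single byte, so the exception path should only trigger if something outside that repertoire slipped in.
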