Re: UNICODE encoding and jdbc related issues - Mailing list pgsql-jdbc
From | Igor Postelnik |
---|---|
Subject | Re: UNICODE encoding and jdbc related issues |
Date | |
Msg-id | 46F30BC04EC6364695BC07D4A57AAD2C7C7EAB@auscorpex-1.austin.messageone.com |
In response to | UNICODE encoding and jdbc related issues (Chris Kratz <chris.kratz@vistashare.com>) |
Responses |
Re: UNICODE encoding and jdbc related issues
|
List | pgsql-jdbc |
> > 2. I'm really not sure I want to change the encoding of our main database to
> > Unicode. Is there a performance loss when going to a UNICODE database
> > encoding? What about sorts, etc. I'm really worried about unintended side
> > effects of moving from SQL_ASCII to UNICODE.
>
> You don't need to use unicode, but you must select another encoding. If
> you'd like to stick with a single byte encoding perhaps LATIN1 would be
> appropriate for you.

I've asked this before on the performance list but didn't get any reply.
Is there substantial performance difference between using SQL_ASCII,
LATIN1, or UNICODE?

> The driver does SET client_encoding which does work for all real server
> encodings. The problem is that SQL_ASCII is not a real encoding. It
> accepts any encoding and cannot do conversions to other encodings. Your
> db right now could easily have a mix of encodings.

ISTM that when you create a database with SQL_ASCII encoding you decide
to deal with character set issues in the applications. Why is the JDBC
driver dictating how the application handles character set issues?

-Igor

> -----Original Message-----
> From: pgsql-jdbc-owner@postgresql.org [mailto:pgsql-jdbc-owner@postgresql.org] On Behalf Of Kris Jurka
> Sent: Wednesday, April 06, 2005 1:23 PM
> To: Chris Kratz
> Cc: pgsql-jdbc@postgresql.org
> Subject: Re: [JDBC] UNICODE encoding and jdbc related issues
>
> On Wed, 6 Apr 2005, Chris Kratz wrote:
>
> > Our production database was created with the default SQL_ASCII encoding. It
> > appears that some of our users have entered characters into the system with
> > characters above 127 (accented vowels, etc). None of the tools we use
> > currently have had a problem with this behavior until recently, everything
> > just worked.
> >
> > I was testing some reporting tools this past weekend and have been playing
> > with Jasper reports[1] . Jasper reports is a Java based reporting tool that
> > reads data from the database via JDBC. When I initially tried to generate
> > reports, the jdbc connection would crash with the following message:
> >
> > org.postgresql.util.PSQLException: Invalid character data was found.
> >
> > Googling eventually turned up a message on the pgsql-jdbc list detailing the
> > problem[2]. Basically, java cannot convert these characters above 127 into
> > unicode which is required by java.
> >
> > After some more googling, I found that if I took a recent database dump and
> > then ran it through iconv[3] and then created the database with a unicode
> > encoding, everything worked.
> >
> > 1. Is there any way to do a iconv type translation inline in a sql statement?
> > ie select translate(text_field, unicode) from table.... Btw, set
> > client_encoding=UNICODE does not work in this situation. In fact the JDBC
> > driver for postgres seems to do this automatically.
>
> You can't do translation inline, how would a driver interpret the results
> of SELECT translate(field1, unicode), translate(field2, latin1) ?
>
> > 3. Is there any other way around this issue? Or are we living dangerously by
> > trying to store non-ascii data in a database created as ascii encoded?
>
> You are living dangerously.
>
> > 4. Has anyone else gone through a conversion like this? Are there any
> > gotchas we should look out for?
>
> The gotchas here are to make sure your other client tools still work
> against the new database.
>
> > [3] iconv -f iso8859-1 -t utf-8 < dbsnapshot.dumpall > dump-utf-8.dumpall
>
> I see your data really is LATIN1. Perhaps you should use that as your db
> encoding. That should keep your existing client tools happy as well as
> the JDBC driver.
>
> Kris Jurka
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
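For readers following the thread, here is a minimal Java sketch of why bytes above 127 stored as LATIN1 cannot be read as UTF-8, and of what the iconv step from footnote [3] does to them. It is an illustration only, not the PostgreSQL JDBC driver's code: the class name `EncodingSketch`, the method names `isValidUtf8` and `latin1ToUtf8`, and the sample bytes are all made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; not part of any driver or library.
class EncodingSketch {

    // Returns true only if the bytes decode as well-formed UTF-8.
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Same idea as "iconv -f iso8859-1 -t utf-8": interpret each byte as a
    // LATIN1 character, then re-encode the resulting text as UTF-8.
    public static byte[] latin1ToUtf8(byte[] latin1Bytes) {
        String text = new String(latin1Bytes, StandardCharsets.ISO_8859_1);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Assumed sample data: "cafés" as a LATIN1 client would have stored it.
        // 0xE9 is 'é' in LATIN1, but on its own it is not valid UTF-8.
        byte[] stored = {'c', 'a', 'f', (byte) 0xE9, 's'};

        System.out.println(isValidUtf8(stored));               // prints "false"
        System.out.println(isValidUtf8(latin1ToUtf8(stored))); // prints "true"
    }
}
```

The conversion only gives the right text if the stored bytes really are LATIN1. As noted above, a SQL_ASCII database accepts any encoding, so it can hold a mix, and the source encoding would have to be checked before converting a dump.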