Thread: pg jdbc driver
Hey folks, I'm currently evaluating an upgrade path from our current ColdFusion 5 / PostgreSQL setup to ColdFusion MX / PostgreSQL. In the current setup, we're using the Merant ODBC driver for postgres. I've installed Fedora Core 2 with Apache 2 and ColdFusion MX, and I grabbed the latest JDBC driver from the postgres website. I copied one of our production websites over to the test box, but ran into problems with display of non-standard english characters. This all displays fine on our production servers running CF5 and the Merant ODBC driver. We have several bilingual (french/english) websites, so this would be a show stopper for us. I created a JDBC datasource to the production database, so the only thing that's changed is the driver (Merant ODBC -> PostgreSQL JDBC) and the application server (ColdFusion 5 -> ColdFusion MX). I'm trying to narrow down the culprit. Is there any chance that the JDBC driver could be the problem here? Perhaps I'm missing something... Cheers and thanks... Dino _____________________________________ Rivendell Software - Dynamic Web Solutions http://www.rivendellsoftware.com Tel 902.461.1697 Fax 902.461.3765
On Sat, 7 Aug 2004, Dino Nardini wrote: > [Could the JDBC driver be causing the problems I'm seeing with my > website after moving from ColdFusion with ODBC? ] Unlikely. A common problem people find when using PostgreSQL's JDBC driver is that they have setup their database with SQL_ASCII encoding which doesn't have any information on what high-bit characters actually represent. Many applications are OK with this and blindly pass data back and forth, but Java requires having correctly encoded data and the JDBC driver will error out with something like "Invalid character data was found..." You reported problems correctly displaying this data, not an error, so I doubt that is the issue. As I mentioned Java is (sometimes painfully) encoding aware. I'm not familar with ColdFusion, but you likely need to set the page encoding correctly for converting Java Strings into bytes. This can be accomplished in a number of ways, for example set globally via the file.encoding system property or on per page basis via methods in javax.servlet.ServletResponse such as setCharacterEncoding, setContentType, and setLocale. Kris Jurka
On Sun, 8 Aug 2004, Dino Nardini wrote: > However, when I tested the same data with ColdFusion MX (built on a Java > platform), I ran into problems. The french characters were either throwing > "Invalid character..." errors, or displaying as garbage. I'm guessing that > CFMX was trying to convert the characters from SQL_ASCII to something else > (UTF-8?) somewhere between the database and the web page. The invalid character data errors are coming from the JDBC driver. When it connects to the database it asks the server to send it data using UTF-8 regardless of the actual database encoding. The server cannot convert SQL_ASCII to UTF-8 and simply returns the SQL_ASCII data. The JDBC driver is expected UTF-8 data and chokes on this, so as you mentioned you will need to dump and restore into a database with a "real" encoding. This doesn't necessarily have to be unicode, but often makes the most sense. Kris Jurka
On Mon, 9 Aug 2004, Dino Nardini wrote: > Hello there, > > Is there no way to force the JDBC driver to simply pass the data as > SQL-ASCII, or ignore the encoding altogether? No because Java itself needs it correctly encoded the JDBC driver needs to make that happen. > I tried to dump and restore the database with UNICODE encoding but get > errors on restore because there is quotes/double quotes within some of the > text data. Is there an encoding that will handle both french language > characters and quotes/double quotes? > I've seen this problem with Microsoft "smart" quotes that are angled, but there should be no problem with plain ' and ". When doing the dump / restore you may need to actually convert the data from whatever encoding you were using with something like iconv. If you are just struggling with the quotes perhaps a simple search and replace would be fine. Kris Jurka
Yes, I have the database encoding set to SQL_ASCII, which hasn't been a problem with ColdFusion 5. As you say, the server has been blindly passing the data back and forth. However, when I tested the same data with ColdFusion MX (built on a Java platform), I ran into problems. The french characters were either throwing "Invalid character..." errors, or displaying as garbage. I'm guessing that CFMX was trying to convert the characters from SQL_ASCII to something else (UTF-8?) somewhere between the database and the web page. I've managed to get CF5 installed on the new server for now, but I plan to do some more testing with CFMX. Probably have to dump the database, re-create with UNICODE encoding, and restore. I suspected that the problem originated with encoding at the CFMX level, but I wanted to cover all bases. Thanks for the feedback. Cheers... Dino At 07:14 PM 08/08/2004 -0500, you wrote: >On Sat, 7 Aug 2004, Dino Nardini wrote: > > > [Could the JDBC driver be causing the problems I'm seeing with my > > website after moving from ColdFusion with ODBC? ] > >Unlikely. A common problem people find when using PostgreSQL's JDBC >driver is that they have setup their database with SQL_ASCII encoding >which doesn't have any information on what high-bit characters actually >represent. Many applications are OK with this and blindly pass data back >and forth, but Java requires having correctly encoded data and the JDBC >driver will error out with something like "Invalid character data was >found..." You reported problems correctly displaying this data, not an >error, so I doubt that is the issue. > >As I mentioned Java is (sometimes painfully) encoding aware. I'm not >familar with ColdFusion, but you likely need to set the page encoding >correctly for converting Java Strings into bytes. This can be >accomplished in a number of ways, for example set globally via the >file.encoding system property or on per page basis via methods in >javax.servlet.ServletResponse such as setCharacterEncoding, >setContentType, and setLocale. > >Kris Jurka _____________________________________ Rivendell Software - Dynamic Web Solutions http://www.rivendellsoftware.com Tel 902.461.1697 Fax 902.461.3765
Hello there, Is there no way to force the JDBC driver to simply pass the data as SQL-ASCII, or ignore the encoding altogether? I tried to dump and restore the database with UNICODE encoding but get errors on restore because there is quotes/double quotes within some of the text data. Is there an encoding that will handle both french language characters and quotes/double quotes? Cheers... Dino At 08:23 PM 08/08/2004 -0500, Kris Jurka wrote: >On Sun, 8 Aug 2004, Dino Nardini wrote: > > > However, when I tested the same data with ColdFusion MX (built on a Java > > platform), I ran into problems. The french characters were either > throwing > > "Invalid character..." errors, or displaying as garbage. I'm guessing > that > > CFMX was trying to convert the characters from SQL_ASCII to something else > > (UTF-8?) somewhere between the database and the web page. > >The invalid character data errors are coming from the JDBC driver. When >it connects to the database it asks the server to send it data using UTF-8 >regardless of the actual database encoding. The server cannot convert >SQL_ASCII to UTF-8 and simply returns the SQL_ASCII data. The JDBC driver >is expected UTF-8 data and chokes on this, so as you mentioned you will >need to dump and restore into a database with a "real" encoding. This >doesn't necessarily have to be unicode, but often makes the most sense. > >Kris Jurka _____________________________________ Rivendell Software - Dynamic Web Solutions http://www.rivendellsoftware.com Tel 902.461.1697 Fax 902.461.3765