Thread: pg jdbc driver

pg jdbc driver

From
Dino Nardini
Date:
Hey folks,

I'm currently evaluating an upgrade path from our current ColdFusion 5 /
PostgreSQL setup to ColdFusion MX / PostgreSQL.  In the current setup,
we're using the Merant ODBC driver for postgres.

I've installed Fedora Core 2 with Apache 2 and ColdFusion MX, and I grabbed
the latest JDBC driver from the postgres website.

I copied one of our production websites over to the test box, but ran into
problems with display of non-standard english characters.  This all
displays fine on our production servers running CF5 and the Merant ODBC
driver.  We have several bilingual (french/english) websites, so this would
be a show stopper for us.

I created a JDBC datasource to the production database, so the only thing
that's changed is the driver (Merant ODBC -> PostgreSQL JDBC) and the
application server (ColdFusion 5 -> ColdFusion MX).  I'm trying to narrow
down the culprit.  Is there any chance that the JDBC driver could be the
problem here?  Perhaps I'm missing something...

Cheers and thanks... Dino

_____________________________________
Rivendell Software - Dynamic Web Solutions
http://www.rivendellsoftware.com
Tel 902.461.1697
Fax 902.461.3765



Re: pg jdbc driver

From
Kris Jurka
Date:

On Sat, 7 Aug 2004, Dino Nardini wrote:

> [Could the JDBC driver be causing the problems I'm seeing with my
> website after moving from ColdFusion with ODBC? ]

Unlikely.  A common problem people find when using PostgreSQL's JDBC
driver is that they have setup their database with SQL_ASCII encoding
which doesn't have any information on what high-bit characters actually
represent.  Many applications are OK with this and blindly pass data back
and forth, but Java requires having correctly encoded data and the JDBC
driver will error out with something like "Invalid character data was
found..."  You reported problems correctly displaying this data, not an
error, so I doubt that is the issue.

As I mentioned Java is (sometimes painfully) encoding aware.  I'm not
familar with ColdFusion, but you likely need to set the page encoding
correctly for converting Java Strings into bytes.  This can be
accomplished in a number of ways, for example set globally via the
file.encoding system property or on per page basis via methods in
javax.servlet.ServletResponse such as setCharacterEncoding,
setContentType, and setLocale.

Kris Jurka

Re: pg jdbc driver

From
Kris Jurka
Date:

On Sun, 8 Aug 2004, Dino Nardini wrote:

> However, when I tested the same data with ColdFusion MX (built on a Java
> platform), I ran into problems.  The french characters were either throwing
> "Invalid character..." errors, or displaying as garbage.  I'm guessing that
> CFMX was trying to convert the characters from SQL_ASCII to something else
> (UTF-8?) somewhere between the database and the web page.

The invalid character data errors are coming from the JDBC driver.  When
it connects to the database it asks the server to send it data using UTF-8
regardless of the actual database encoding.  The server cannot convert
SQL_ASCII to UTF-8 and simply returns the SQL_ASCII data.  The JDBC driver
is expected UTF-8 data and chokes on this, so as you mentioned you will
need to dump and restore into a database with a "real" encoding.  This
doesn't necessarily have to be unicode, but often makes the most sense.

Kris Jurka

Re: pg jdbc driver

From
Kris Jurka
Date:

On Mon, 9 Aug 2004, Dino Nardini wrote:

> Hello there,
>
> Is there no way to force the JDBC driver to simply pass the data as
> SQL-ASCII, or ignore the encoding altogether?

No because Java itself needs it correctly encoded the JDBC driver needs to
make that happen.

> I tried to dump and restore the database with UNICODE encoding but get
> errors on restore because there is quotes/double quotes within some of the
> text data.  Is there an encoding that will handle both french language
> characters and quotes/double quotes?
>

I've seen this problem with Microsoft "smart" quotes that are angled, but
there should be no problem with plain ' and ".  When doing the dump /
restore you may need to actually convert the data from whatever encoding
you were using with something like iconv.  If you are just struggling with
the quotes perhaps a simple search and replace would be fine.

Kris Jurka

Re: pg jdbc driver

From
Dino Nardini
Date:
Yes, I have the database encoding set to SQL_ASCII, which hasn't been a
problem with ColdFusion 5.  As you say, the server has been blindly passing
the data back and forth.

However, when I tested the same data with ColdFusion MX (built on a Java
platform), I ran into problems.  The french characters were either throwing
"Invalid character..." errors, or displaying as garbage.  I'm guessing that
CFMX was trying to convert the characters from SQL_ASCII to something else
(UTF-8?) somewhere between the database and the web page.

I've managed to get CF5 installed on the new server for now, but I plan to
do some more testing with CFMX.  Probably have to dump the database,
re-create with UNICODE encoding, and restore.

I suspected that the problem originated with encoding at the CFMX level,
but I wanted to cover all bases.

Thanks for the feedback.

Cheers... Dino

At 07:14 PM 08/08/2004 -0500, you wrote:


>On Sat, 7 Aug 2004, Dino Nardini wrote:
>
> > [Could the JDBC driver be causing the problems I'm seeing with my
> > website after moving from ColdFusion with ODBC? ]
>
>Unlikely.  A common problem people find when using PostgreSQL's JDBC
>driver is that they have setup their database with SQL_ASCII encoding
>which doesn't have any information on what high-bit characters actually
>represent.  Many applications are OK with this and blindly pass data back
>and forth, but Java requires having correctly encoded data and the JDBC
>driver will error out with something like "Invalid character data was
>found..."  You reported problems correctly displaying this data, not an
>error, so I doubt that is the issue.
>
>As I mentioned Java is (sometimes painfully) encoding aware.  I'm not
>familar with ColdFusion, but you likely need to set the page encoding
>correctly for converting Java Strings into bytes.  This can be
>accomplished in a number of ways, for example set globally via the
>file.encoding system property or on per page basis via methods in
>javax.servlet.ServletResponse such as setCharacterEncoding,
>setContentType, and setLocale.
>
>Kris Jurka

_____________________________________
Rivendell Software - Dynamic Web Solutions
http://www.rivendellsoftware.com
Tel 902.461.1697
Fax 902.461.3765



Re: pg jdbc driver

From
Dino Nardini
Date:
Hello there,

Is there no way to force the JDBC driver to simply pass the data as
SQL-ASCII, or ignore the encoding altogether?

I tried to dump and restore the database with UNICODE encoding but get
errors on restore because there is quotes/double quotes within some of the
text data.  Is there an encoding that will handle both french language
characters and quotes/double quotes?

Cheers... Dino


At 08:23 PM 08/08/2004 -0500, Kris Jurka wrote:


>On Sun, 8 Aug 2004, Dino Nardini wrote:
>
> > However, when I tested the same data with ColdFusion MX (built on a Java
> > platform), I ran into problems.  The french characters were either
> throwing
> > "Invalid character..." errors, or displaying as garbage.  I'm guessing
> that
> > CFMX was trying to convert the characters from SQL_ASCII to something else
> > (UTF-8?) somewhere between the database and the web page.
>
>The invalid character data errors are coming from the JDBC driver.  When
>it connects to the database it asks the server to send it data using UTF-8
>regardless of the actual database encoding.  The server cannot convert
>SQL_ASCII to UTF-8 and simply returns the SQL_ASCII data.  The JDBC driver
>is expected UTF-8 data and chokes on this, so as you mentioned you will
>need to dump and restore into a database with a "real" encoding.  This
>doesn't necessarily have to be unicode, but often makes the most sense.
>
>Kris Jurka

_____________________________________
Rivendell Software - Dynamic Web Solutions
http://www.rivendellsoftware.com
Tel 902.461.1697
Fax 902.461.3765