Thread: Character Encoding Confusion
Hi!

We've been using PostgreSQL (currently 7.4) via ODBC with ColdFusion until now and didn't experience any problems. But now we want to move from ColdFusion 4.5 to ColdFusion MX, thus abandoning ODBC and migrating to JDBC.

As ODBC seems to be blissfully unaware of any character encodings whatsoever, so were we - our databases are encoded in SQL_ASCII, although we have stored German special chars (ÄÖÜäöü and ß), and from what I have read so far, these are stored as multibyte and thus exceed the SQL_ASCII specification.

With ODBC we never noticed the mistake we'd made. Now with JDBC/ColdFusion MX 6.1, we see all sorts of weird characters in our web application, but not the ones which are stored in the database.

I tried setting different character sets for the JDBC driver, using the URL syntax
jdbc:postgresql://123.456.789.012:5432/database?charSet=characterSet
with charSet=iso-8859-1 or charSet=UTF-8 for example, but that just didn't change anything.

Now is there some way to elegantly resolve the issue without dropping and recreating the databases in order to change the encoding? Can we somehow get the JDBC driver to act just as the ODBC driver did - silently passing on the "bad" characters without changing anything?

And if there is just no way to avoid that, what's the correct procedure for changing the encoding anyway? How would I be able to migrate the current data without any data loss and with the least possible downtime?

Kind regards

Markus

On Mon, 8 Mar 2004, Markus Wollny wrote:

> Hi!
>
> As ODBC seems to be blissfully unaware of any character encodings
> whatsoever, so were we - our databases are encoded in SQL_ASCII,
> although we have stored german special chars (ÄÖÜäöü and ß), and from
> what I have read so far, these are stored as multibyte and thus exceed
> the SQL-ASCII specification.
>
> With ODBC we never noticed the mistake we'd made. Now with
> JDBC/ColdFusion MX 6.1, we see all sorts of weird characters on our
> web-application, but not the ones which are stored in the database.
>
> I tried setting different character sets for the JDBC-driver, using the
> URL-syntax
> jdbc:postgresql://123.456.789.012:5432/database?charSet=characterSet
> with charSet=iso-8859-1 or charSet=UTF-8 for example, but that just
> didn't change anything.
>
> Now is there some way to elegantly resolve the issue without dropping
> and recreating the databases in order to change the encoding? Can we
> somehow get the JDBC-driver to act just as the ODBC-driver did -
> silently passing on the "bad" characters without changing anything?
>

The JDBC driver needs the data to be encoded correctly. The ?charSet= option only works on 7.2 and earlier databases, because multibyte support was not compiled in by default back then. Fixing this will require a dump and reload.

Kris Jurka
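
Before a dump and reload, it may help to find out which encoding the stored bytes are actually in, since a SQL_ASCII database does not record that. The following is only a rough sketch in Python, not part of the original replies: the names `classify_bytes` and `summarize` and the file name `dump.sql` are made up for illustration. It sorts each line of a dump into plain ASCII, valid UTF-8, or "something single-byte" such as LATIN1 or LATIN9. The check is a heuristic: a byte sequence that happens to be valid UTF-8 could in principle still be Latin text.

```python
from collections import Counter
from typing import Iterable


def classify_bytes(raw: bytes) -> str:
    """Roughly classify raw bytes as 'ascii', 'utf-8' or 'single-byte'."""
    try:
        raw.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8, so most likely a single-byte encoding
        # such as LATIN1 or LATIN9 (an assumption, not a proof).
        return "single-byte"


def summarize(lines: Iterable[bytes]) -> Counter:
    """Count how many lines fall into each category."""
    return Counter(classify_bytes(line) for line in lines)


# Hypothetical usage on a plain-text dump file:
#
#     with open("dump.sql", "rb") as dump:
#         print(summarize(dump))
```

If every non-ASCII line lands in the same category, that is a reasonable hint for which encoding to declare when reloading. A mix of "utf-8" and "single-byte" lines would suggest that different clients wrote data in different encodings, which needs cleaning up by hand.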

Markus Wollny wrote:

> As ODBC seems to be blissfully unaware of any character encodings
> whatsoever, so were we - our databases are encoded in SQL_ASCII,
> although we have stored german special chars (ÄÖÜäöü and ß), and from
> what I have read so far, these are stored as multibyte and thus
> exceed the SQL-ASCII specification.

SQL_ASCII is not a real encoding; it simply means that bytes are passed through without being looked at. If you want to get sensible behavior with German characters, you should use LATIN9 as the server encoding.
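
To illustrate what "passing bytes through" can lead to, here is a small Python sketch (not from the original thread; the helper name `read_as` and the sample word are made up). It encodes the same German word once as LATIN9 (ISO-8859-15) and once as UTF-8, then decodes the bytes under the wrong assumption, which is roughly what happens when a client expects a different encoding than the one the bytes were written in.

```python
# Illustration only: the same text stored as bytes in two encodings.
text = "Größe"

latin9_bytes = text.encode("iso-8859-15")  # one byte per character
utf8_bytes = text.encode("utf-8")          # ö and ß take two bytes each


def read_as(raw: bytes, encoding: str) -> str:
    """Decode raw bytes the way a client assuming `encoding` would."""
    # errors="replace" substitutes U+FFFD for undecodable bytes
    # instead of raising an exception.
    return raw.decode(encoding, errors="replace")


print(read_as(latin9_bytes, "iso-8859-15"))  # bytes and assumption match
print(read_as(utf8_bytes, "iso-8859-15"))    # UTF-8 bytes misread as LATIN9
print(read_as(latin9_bytes, "utf-8"))        # LATIN9 bytes misread as UTF-8
```

Only the first call gives back the original word. The other two produce the kind of "weird characters" described at the start of the thread, because nothing in a SQL_ASCII database says which interpretation is the right one.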