Jason Tesser wrote:
> 2004-10-26 16:54:51,167 ERROR [STDERR]
> org.postgresql.util.PSQLException: Invalid character data was found.
> This is most likely caused by stored data containing characters that are
> invalid for the character set the database was created in. The most
> common example of this is storing 8bit data in a SQL_ASCII database.
As the error says, this problem usually arises from storing 8-bit data
in a SQL_ASCII database.
The JDBC driver always sets client_encoding = UNICODE and expects the
data arriving from the server to be UTF-8 ("unicode") encoded. When you
have a SQL_ASCII database, the server has no information about how to
translate bytes above 127 into the corresponding Unicode characters, so
it just passes them straight through. The driver then complains about
invalid UTF-8 sequences.
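
To illustrate why such bytes get rejected, here is a minimal sketch in
plain Java (no JDBC involved). It is not the driver's actual code; the
class name Utf8Check and the sample bytes are made up for the example.
It strictly decodes a byte array as UTF-8 and reports whether that
succeeds:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class Utf8Check {
    // Returns true if the bytes form well-formed UTF-8, false otherwise.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "cafe" with an e-acute, stored as ISO-8859-1: one byte, 0xE9.
        byte[] latin1 = {'c', 'a', 'f', (byte) 0xE9};
        // The same text stored as UTF-8: two bytes, 0xC3 0xA9.
        byte[] utf8 = {'c', 'a', 'f', (byte) 0xC3, (byte) 0xA9};

        System.out.println("latin1 bytes valid UTF-8? " + isValidUtf8(latin1));
        System.out.println("utf8 bytes valid UTF-8?   " + isValidUtf8(utf8));
    }
}
```

A lone 0xE9 byte starts a multi-byte UTF-8 sequence that never
completes, which is the kind of input a SQL_ASCII database hands to the
driver unchanged.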
It's not just a case of somehow making the JDBC driver accept those
sequences; the driver really does need them translated to Unicode,
because Java's internal string format uses a Unicode representation. To
do this translation, you need to know the actual encoding the data is
using. For post-7.2 servers, the JDBC driver chooses to let the server
deal with this, so you need to get the encoding information right on the
database side.
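
As a rough sketch of what that translation involves (again plain Java,
not what the server or driver literally runs, and assuming for the sake
of the example that the stored bytes are ISO-8859-1), decoding is only
possible once the source encoding is named. The class name
Latin1ToUtf8 is hypothetical:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class Latin1ToUtf8 {
    // Interpret raw bytes as ISO-8859-1 text (assumed source encoding).
    static String decodeLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    // Encode a Java string as UTF-8, the form the driver expects on the wire.
    static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] stored = {'c', 'a', 'f', (byte) 0xE9}; // e-acute as one byte
        String text = decodeLatin1(stored);
        byte[] wire = toUtf8(text);

        // Print the UTF-8 bytes in hex so the output is ASCII-only.
        StringBuilder hex = new StringBuilder();
        for (byte b : wire) {
            hex.append(String.format("%02X ", b & 0xFF));
        }
        System.out.println(hex.toString().trim());
    }
}
```

With a LATIN1 database the server performs the equivalent conversion
itself when client_encoding is UNICODE; with SQL_ASCII it has no source
encoding to start from, so no conversion can happen.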
So you will need to recreate your database using an encoding that
reflects the data actually stored in it. Presumably the high-ASCII
sequences already in the database are *not* Unicode; they're probably
ISO-8859-1 or something similar? In that case you can probably dump the
data and reload it into a database created with the LATIN1 encoding.
-O