Oliver Jowett <oliver@opencloud.com> writes:
> The problem, as I understand it from Tom's explanation, is that
> SQL_ASCII only works if everyone is using the same client_encoding; the
> server has no knowledge of the real underlying encoding of the data so
> can't do conversions.
Not only can the server not do conversions, but it cannot count string
lengths "correctly" in strings that are really in a multibyte encoding.
When JDBC sends a UTF8 string that contains some non-ASCII characters,
the server can store the string safely, but it cannot operate on it
in any intelligent way.
I wonder whether the JDBC driver ought to warn when it sees
server_encoding == SQL_ASCII? Using JDBC with such a setting is
certainly just asking for trouble.
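A minimal sketch of what such a warning could look like on the driver or
application side. The class and method names are illustrative, not actual
PgJDBC code; in a real driver the encoding name would come from the
server_encoding ParameterStatus message at startup (or from running
"SHOW server_encoding").

```java
// Sketch: detect a SQL_ASCII server and warn, since non-ASCII data
// stored under that encoding cannot be converted or measured correctly.
public class EncodingCheck {
    /** True when the server encoding offers no conversion support. */
    static boolean isSqlAscii(String serverEncoding) {
        return "SQL_ASCII".equalsIgnoreCase(serverEncoding);
    }

    public static void main(String[] args) {
        // Illustrative value; a driver would read this from the server.
        String serverEncoding = "SQL_ASCII";
        if (isSqlAscii(serverEncoding)) {
            System.out.println("WARNING: server_encoding is SQL_ASCII; "
                + "the server cannot convert or correctly handle "
                + "non-ASCII string data");
        }
    }
}
```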
> JDBC always uses a client_encoding of UNICODE. I don't know what ODBC
> does, but apparently it's not using UNICODE.
ODBC is probably just passing through the client data as-is, and not
doing anything at all with the encoding settings.
> Perhaps one option is to set the database encoding to UNICODE, and
> either get the ODBC driver to issue an appropriate "SET client_encoding"
> on connection setup (I don't know if ODBC lets you do this) or set the
> default client_encoding in postgresql.conf to whatever is appropriate
> for ODBC clients?
That would work. Plan B would be to set the database encoding to
whatever the ODBC clients are using, and let encoding conversions happen
when talking to a JDBC client.
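The two plans can be sketched concretely. LATIN1 as the ODBC clients'
encoding and the database name "mydb" are assumptions for illustration:

```sql
-- Plan A: database encoding is UNICODE (UTF8); ODBC clients declare
-- their real encoding, either per-session:
SET client_encoding TO 'LATIN1';  -- assuming the ODBC clients use LATIN1

-- ...or as the server-wide default in postgresql.conf:
--   client_encoding = 'LATIN1'
-- (JDBC overrides this with client_encoding = UNICODE on connect.)

-- Plan B: create the database in the ODBC clients' encoding and let
-- the server convert when a JDBC (UNICODE) client connects:
CREATE DATABASE mydb WITH ENCODING 'LATIN1';
```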
The one thing that is absolutely, positively guaranteed not to work is
setting the DB encoding to SQL_ASCII. That defeats any chance you have
of getting intelligent encoding behavior from the system.
regards, tom lane