Re: Charset encoding patch to JDBC driver - Mailing list pgsql-jdbc

From Javier Yáñez
Subject Re: Charset encoding patch to JDBC driver
Msg-id 423952E9.4010308@cibal.es
In response to Re: Charset encoding patch to JDBC driver  (Oliver Jowett <oliver@opencloud.com>)
Responses Re: Charset encoding patch to JDBC driver  (Oliver Jowett <oliver@opencloud.com>)
List pgsql-jdbc
Oliver Jowett wrote:

> I'm uncomfortable with applying this sort of patch to the official
> driver, since it makes the driver more complex just to handle what is
> arguably a database misconfiguration. It also introduces a new class of
> error: a mismatch between the driver's configured charSet and the actual
> database.

    I think this patch is necessary to solve some real-world problems.
In my particular case I have to write a J2EE application that accesses
an existing database. The database uses SQL_ASCII encoding, and with the
current version of pgjdbc, whenever a query result contains an 8-bit
character (very common in Spanish), this error appears:

org.postgresql.util.PSQLException: Invalid character data was found.
This is most likely caused by stored data containing characters that are
invalid for the character set the database was created in.  The most
common example of this is storing 8bit data in a SQL_ASCII database.
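
The failure can be reproduced outside the driver. The following is a
minimal sketch, not the driver's actual code: it assumes the bytes were
written by a LATIN1 client and are then decoded strictly as UTF-8. The
class name EncodingDemo and the helper strictDecode are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    /** Decodes strictly; returns null if the bytes are invalid for the charset. */
    static String strictDecode(byte[] data, Charset charset) {
        try {
            return charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data))
                    .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // "Yáñez" as a LATIN1 client would store it: one byte per character.
        byte[] stored = {'Y', (byte) 0xE1, (byte) 0xF1, 'e', 'z'};

        // 0xE1 starts a multi-byte UTF-8 sequence, but 0xF1 is not a valid
        // continuation byte, so strict UTF-8 decoding rejects the data.
        System.out.println(strictDecode(stored, StandardCharsets.UTF_8));

        // Decoding with the charset the data was really written in works.
        System.out.println(strictDecode(stored, StandardCharsets.ISO_8859_1));
    }
}
```

A SQL_ASCII database does no conversion, so the server hands back these
bytes unchanged and only the client can know which charset they are in.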

Many people have similar problems:

http://www.google.es/search?q=%22Invalid+character+data+was+found%22&hl=es&lr=&start=10&sa=N

http://linux.kieser.net/java_pg_unicode.html

    I cannot tell my customer to change the database encoding, because
other (non-Java) applications might stop working or show strange
characters.

    On the other hand, I do not think that using SQL_ASCII encoding is a
database misconfiguration, nor that storing 8-bit data in a SQL_ASCII
database is incorrect. Other applications use the same database through
ODBC without problems.


> Comments on the patch itself:
>
> - it is missing changes to the v2 protocol path

I have not tested it, but I think the v2 protocol path already has the
ability to choose the encoding.

> - why does it remove the client_encoding sanity check on connect?

My intention was only to remove the check that client_encoding equals
UNICODE. I agree that client_encoding should still be checked.

> - since encoding does not change for the lifetime of the connection,
> can't you make the encoding a field of QueryExecutoryImpl rather than
> passing it around everywhere?

I agree.

> - it may be better to pass encoding as a parameter to
> SimpleParameterList methods that need it, rather than storing the (same)
> value on every list instance.

I agree too. The encoding object is only used in two methods.
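
The two suggestions above could look roughly like this. It is only a
sketch under assumptions: ParamList and Executor are hypothetical
stand-ins, not the driver's real SimpleParameterList or executor classes,
and the method names are invented for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ParameterEncodingSketch {

    /** Hypothetical parameter list: it stores values but no charset. */
    static final class ParamList {
        private final String[] values;

        ParamList(int size) {
            values = new String[size];
        }

        void setString(int index, String value) {
            values[index - 1] = value; // 1-based, as in JDBC
        }

        /** The encoding is passed in only where bytes are produced. */
        byte[] encode(int index, Charset encoding) {
            return values[index - 1].getBytes(encoding);
        }
    }

    /** Hypothetical executor: one charset, fixed for the connection. */
    static final class Executor {
        private final Charset encoding;

        Executor(Charset encoding) {
            this.encoding = encoding;
        }

        byte[] encodeParameter(ParamList params, int index) {
            return params.encode(index, encoding);
        }
    }

    public static void main(String[] args) {
        ParamList params = new ParamList(1);
        params.setString(1, "\u00f1"); // the letter n with tilde

        Executor latin1 = new Executor(StandardCharsets.ISO_8859_1);
        Executor utf8 = new Executor(StandardCharsets.UTF_8);

        // The same list yields different byte lengths per connection charset.
        System.out.println(latin1.encodeParameter(params, 1).length);
        System.out.println(utf8.encodeParameter(params, 1).length);
    }
}
```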


I'm going to try to improve the patch and post it.


Thank you for your time!


Javier Yáñez

--
CIBAL Multimedia S.L.
Edificio 17, C-10
ParcBIT
Camino de Can Manuel s/n
07120 - Palma de Mallorca
Spain

