
Re: Very strange Error in Updates - Mailing list pgsql-jdbc

From	Dario V. Fassi
Subject	Re: Very strange Error in Updates
Date	July 15, 2004 13:30:59
Msg-id	40F6B35E.4010600@sistemat.com.ar
In response to	Re: Very strange Error in Updates (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Very strange Error in Updates Re: Very strange Error in Updates
List	pgsql-jdbc


My problem is that the data is already inside the PostgreSQL server (with SQL_ASCII encoding), inserted by Win32/ODBC clients.

Now from JDBC I can't handle any row that has a field with one or more 8-bit characters.
At the same time, the Win32/ODBC programs continue to use the data without any problem.
This leaves me in a situation that is hard to explain.

One more question: using PreparedStatement.setBytes(), can I get the same treatment that ODBC gives those fields?
Thanks to all for your help.
Dario.

Tom Lane wrote:

Oliver Jowett <oliver@opencloud.com> writes:

The JDBC driver always speaks UNICODE when it can, since that matches 
Java's internal string representation. I suspect that what's happening is:

0) the driver sets client_encoding = UNICODE during connection setup

Right.

1) the driver encodes the parameter as UNICODE (== UTF8); for characters 
above 127 this encoding will result in more than one byte per character.

Right.

2) the server converts from client_encoding UNICODE to database encoding 
SQL_ASCII; for characters that are invalid in SQL_ASCII (>127) it does 
some arbitrary conversion, probably just copying the illegal values 
unchanged.

Not really.  SQL_ASCII encoding basically means "we don't know what this
data is, just store it verbatim".  So the UTF-8 string sent by the
driver is stored verbatim.

3) you end up with extra characters in the resulting value which exceeds 
the varchar's size.

Right.  Since the server does not know what encoding is in use, it falls
back to the assumption that 1 character == 1 byte, under which
assumption the string violates the varchar(30) constraint.

Had the server known which encoding was in use, it would have counted
the characters correctly.
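
The byte-versus-character mismatch described above can be reproduced on the client side. The following is only a minimal sketch: the class name Utf8Length and the sample value are made up for illustration, and it shows only how Java encodes a string as UTF-8, not what the server does internally.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Length {
    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Hypothetical sample value: 30 characters, each above 127
        // (U+00F1, "n with tilde").
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 30; i++) {
            sb.append('\u00f1');
        }
        String value = sb.toString();

        // Prints 30: what a server that knows the encoding would count.
        System.out.println("characters:  " + value.length());
        // Prints 60: what is counted when 1 character == 1 byte is assumed.
        System.out.println("UTF-8 bytes: " + utf8Bytes(value));
    }
}
```

A value like this fits a varchar(30) when counted in characters, but not when every byte is counted as one character.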

The solution is to use a database encoding that matches your data.

Actually, if you intend to access the database primarily through JDBC,
it'd be best to use server encoding UNICODE.  The JDBC driver will
always want UNICODE on the wire, and I see no reason to force extra
character set conversions.  Non-UNICODE-aware clients can be handled by
setting client_encoding properly.
		regards, tom lane
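
As a rough illustration of why the database encoding should match the data, the sketch below checks whether bytes written by a single-byte client are well-formed UTF-8. It assumes the Win32/ODBC clients wrote ISO-8859-1; the thread does not say which single-byte encoding they actually used, and the class name VerbatimBytes and the sample text are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class VerbatimBytes {
    // True if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String text = "Espa\u00f1a"; // contains one character above 127

        // Assumption: this is what a single-byte client would have stored
        // verbatim in an SQL_ASCII database.
        byte[] singleByte = text.getBytes(StandardCharsets.ISO_8859_1);
        // What a client speaking UTF-8 on the wire would have stored.
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);

        System.out.println("single-byte data valid as UTF-8: " + isValidUtf8(singleByte));
        System.out.println("UTF-8 data valid as UTF-8:       " + isValidUtf8(utf8));
    }
}
```

Under that assumption, the first check reports false and the second true: the same stored bytes are fine for a client that reads them as single-byte text, and malformed for one that expects UTF-8.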
