Re: UTF8 - Mailing list pgsql-jdbc

From Oliver Jowett
Subject Re: UTF8
Date
Msg-id 447FF47C.5000901@opencloud.com
In response to Re: UTF8  (Markus Schaber <schabi@logix-tt.com>)
Responses Re: UTF8  (Markus Schaber <schabi@logix-tt.com>)
List pgsql-jdbc
Markus Schaber wrote:
> Hi, Bakos,
>
> Bakos Sandor wrote:
>
>
>>I get the following exception when I read a simple TXT file in Linux and
>>try to INSERT to the psql. (8.1.4)
>>
>>org.postgresql.util.PSQLException: ERROR: character 0xefbfbd of encoding
>>"UTF8" has no equivalent in "LATIN2"
>
>
> This means that your database is encoded in ISO-LATIN2 charset, and psql
> is telling the server the data it sends is UTF-8. The server tries to
> convert the UTF-8 Data into LATIN2, but there is a character (whose
> UTF8-Sequence is 0xefbfbd) that is not contained in LATIN-2.
>
> Either your file is latin-2 in reality (or even another charset), then
> you should tell psql to use the latin-2 encoding.
>
> Or your file really is utf-8, and really contains characters not
> contained in latin-2. Then you have two possibilities: Edit the file and
> replace those characters with some transcription, or convert your
> database to utf-8 encoding (needs a dump&restore).

Actually, given that that's a Java JDBC exception, there's no 'psql'
client involved at all.

The JDBC driver always uses UTF8 as the client encoding, since that maps
easily from the native Java string representation (UTF-16) and every
possible Java String can be represented in UTF8. Of course, not every
possible Java String can be represented as LATIN2, which is the cause of
the error.
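
As a rough illustration of that last point (a sketch only, not what the
driver or the server actually runs): Java's own ISO-8859-2 encoder can
report whether a String fits. I'm assuming here that the JDK ships the
"ISO-8859-2" charset and that it is a fair stand-in for the server's
LATIN2 conversion. The class and method names are made up for the example.

```java
import java.nio.charset.Charset;

public class Latin2Check {
    // Assumption: the JDK's "ISO-8859-2" charset approximates PostgreSQL's LATIN2.
    static boolean fitsInLatin2(String s) {
        // Create a fresh encoder per call; CharsetEncoder is not thread-safe.
        return Charset.forName("ISO-8859-2").newEncoder().canEncode(s);
    }

    public static void main(String[] args) {
        // U+0151 (o with double acute) is part of ISO-8859-2.
        System.out.println(fitsInLatin2("\u0151"));
        // U+FFFD (the replacement character) is not.
        System.out.println(fitsInLatin2("\uFFFD"));
    }
}
```

A check like this could be run on the data before the INSERT to find the
offending characters, instead of waiting for the server to reject them.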

My guess is that when *reading* the text file originally, the wrong
encoding is being used to convert the bytes to Java Strings. The
sequence 0xefbfbd is the UTF-8 form of U+FFFD, the replacement character
that Java's decoders substitute for bytes they cannot decode, so the
String is probably already garbage before it reaches the driver.
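
Something like the following shows the effect. It is only a sketch: I'm
assuming the file is really ISO-8859-2 (I haven't seen it), and the class
and method names are invented for the example. The same Charset argument
can be passed to InputStreamReader or Files.newBufferedReader when reading
the real file.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DecodeDemo {
    // Convert raw file bytes to a String using an explicitly chosen charset.
    static String decode(byte[] raw, Charset cs) {
        return new String(raw, cs);
    }

    // Render bytes as lowercase hex, e.g. for comparison with the server error.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 0xF5 is U+0151 (o with double acute) in ISO-8859-2,
        // but it is not a valid byte in UTF-8.
        byte[] raw = { (byte) 0xF5 };

        // Wrong charset: the invalid byte is replaced with U+FFFD.
        String wrong = decode(raw, StandardCharsets.UTF_8);
        // Assumed-correct charset for this hypothetical file.
        String right = decode(raw, Charset.forName("ISO-8859-2"));

        // What the driver would send to the server for each String.
        System.out.println(hex(wrong.getBytes(StandardCharsets.UTF_8)));
        System.out.println(hex(right.getBytes(StandardCharsets.UTF_8)));
    }
}
```

The first line printed is the same byte sequence the server complained
about; the second is an ordinary two-byte UTF-8 character that LATIN2 can
represent.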

-O