Re: Charset encoding and accents - Mailing list pgsql-hackers

From Thomas O'Dowd
Subject Re: Charset encoding and accents
Date
Msg-id 1049977197.1483.90.camel@beast.uwillsee.com
In response to Charset encoding and accents  (Davide Romanini <romaz@libero.it>)
Responses Re: Charset encoding and accents  (Barry Lind <blind@xythos.com>)
List pgsql-hackers
Davide,

ASCII implies 7-bit characters, which don't carry enough information
to store the accented characters that you are using. I'm confused as to
how they are being stored in the database at all if this is the case. I
presume they get stored because the 8th bit is there anyway by default,
but I don't think that should really be relied on.

Your database should probably be using LATIN1 (ISO-8859-1) or some other
8-bit encoding if you really want to store 8-bit information in it.
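
To illustrate the 7-bit versus 8-bit point, here is a minimal sketch using
only the standard Java charset classes. It is not driver code; the class
and method names (CharsetDemo, decode) are made up for this example. It
decodes the same bytes three ways: as LATIN1 the accented character
survives, while as US-ASCII or UTF-8 the byte 0xE0 is not valid on its own
and Java substitutes the replacement character U+FFFD.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the PostgreSQL JDBC driver.
class CharsetDemo {
    // Decode raw bytes with the given charset. The String constructor
    // replaces malformed or unmappable input with U+FFFD.
    static String decode(byte[] raw, Charset cs) {
        return new String(raw, cs);
    }

    public static void main(String[] args) {
        // "città" encoded as LATIN1: the 'à' is the single byte 0xE0.
        byte[] raw = "citt\u00e0".getBytes(StandardCharsets.ISO_8859_1);

        // Correct charset: the accent is preserved.
        System.out.println(decode(raw, StandardCharsets.ISO_8859_1));
        // 7-bit ASCII: 0xE0 is out of range and gets replaced.
        System.out.println(decode(raw, StandardCharsets.US_ASCII));
        // UTF-8: a lone 0xE0 is an incomplete sequence and gets replaced.
        System.out.println(decode(raw, StandardCharsets.UTF_8));
    }
}
```

The driver's own decodeUTF8 may handle malformed input differently from
the JDK decoder shown here, so the exact garbled output can differ.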

Anyway, try connecting with:

jdbc:postgresql://localhost/prova?charSet=LATIN1

This might well work for you. That said, I haven't tried this, nor have I
dug into the internals of the java driver in a while. I'll Cc the jdbc list.
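
For reference, a rough sketch of how that URL could be used from Java. The
charSet parameter and the database name prova come from this thread; the
class name, helper names, user and password are placeholders, and whether
a given driver version actually honours charSet is untested here.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper class for illustration only.
class CharSetUrl {
    // Append charSet=<name> to a JDBC URL, using '?' or '&' as appropriate.
    static String withCharSet(String baseUrl, String charSet) {
        String sep = baseUrl.contains("?") ? "&" : "?";
        return baseUrl + sep + "charSet=" + charSet;
    }

    // Open a connection; requires the PostgreSQL JDBC driver on the classpath.
    static Connection connect(String url, String user, String password)
            throws SQLException {
        return DriverManager.getConnection(url, user, password);
    }

    public static void main(String[] args) throws SQLException {
        String url = withCharSet("jdbc:postgresql://localhost/prova", "LATIN1");
        // "user" and "secret" are placeholder credentials.
        try (Connection conn = connect(url, "user", "secret")) {
            System.out.println("Connected with " + url);
        }
    }
}
```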

Tom.

On Thu, 2003-04-10 at 18:04, Davide Romanini wrote:
> Hi,
>
> I've posted this problem two times in the pgsql-jdbc user list, but no
> one helped me to solve it. I think this is a really serious problem in
> the jdbc driver. I've tried different solutions with no result.
>
> Well, let me explain the problem. I've a currently working database in
> PostgreSQL. There's an application, written in M$ Access, that uses the
> database through the ODBC driver with no problems. I'd want to access
> the data using a Swing application through the jdbc driver.
> At server side the charset encoding is set as SQL_ASCII. It is not a
> problem because all the strings containing accented characters are
> retrieved correctly by ODBC and also the psql client.
> But if I retrieve strings containing accents (like àòù) using jdbc I get
> in trouble because my accents get garbled. For example: the string 'La
> città di Forlì' is retrieved and displayed as 'La citt?di Forl?'!
>
> I've worked a bit around the problem with the source code of the driver.
> I notice that when I call rs.getString(), the driver invokes (at a
> certain point) the method org.postgresql.core.Encoding.decode(byte[]
> encodedString, int offset, int length).
> This method calls decodeUTF8 when the actual encoding is equal to
> "UTF-8". If the encoding is different, it simply returns a new
> String(encodedString, offset, length, encoding).
> Well, my database is SQL_ASCII, so the jdbc driver should return a new
> string and not call decodeUTF8. But when I do a step by step debug into
> the source, the encoding is ALWAYS equal to UTF-8! I've also tried to set
> a parameter in my connection string:
> jdbc:postgresql://localhost/prova?charSet=SQL_ASCII (I've tried a lot of
> different encodings here). The encoding is always UTF-8.
> Well, I thought 'if the driver wants strings to be UNICODE, set up the
> server variable CLIENT_ENCODING to UNICODE'. No result! It doesn't change!
> The only way to have my string displayed correctly is to comment out all
> the decodeUTF8 code and make it return a new String(data). So I think
> that if the encoding is correctly recognized to be different from UTF-8
> the decode method will return the new String, which is the correct
> behaviour in my case.
>
> Please don't tell me to change my database to UNICODE. I cannot do
> that. And I do not WANT to do that. Why does the ODBC driver work fine
> while the JDBC driver works only with UNICODE databases?? It's a bug and
> should be corrected. If I were skilled enough I would correct the bug
> myself, but I don't know much about the JDBC standard.
>
> I hope you can answer with a solution. Really, the driver is simply
> unusable for serious work with this bug.
>
> The problem is not solved with the latest stable (version 7.3 build 109)
> and development (version 7.4 build 204) release of the driver.
>
> Regards, Romaz
> --
> Davide Romanini
--
Thomas O'Dowd  - Got a keitai? Get Nooped!
tom@nooper.com - http://nooper.com
