Re: Character Encoding problem - Mailing list pgsql-jdbc

From Craig Ringer
Subject Re: Character Encoding problem
Date
Msg-id 47F9A464.5020501@postnewspapers.com.au
In response to Character Encoding problem  ("antony baxter" <antony.baxter@gmail.com>)
Responses Re: Character Encoding problem  (Craig Ringer <craig@postnewspapers.com.au>)
List pgsql-jdbc
antony baxter wrote:

> Displaying 'input' character by character:
> Character 0 = '8211'
> Character 1 = '235'
> Character 2 = '8212'
> Character 3 = '196'
> Character 4 = '8212'
> Character 5 = '231'
> Character 6 = '8211'
> Character 7 = '937'
> Character 8 = '8212'
> Character 9 = '199'

There's your problem. Your *input* is mangled.

The above decodes to:

–ë—Ä—ç–Ω—Ç
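
If you want to check that decoding yourself, here is a minimal Java sketch. The class and method names are made up for illustration; the ten values are copied from the quoted output above, and nothing here comes from the original poster's code.

```java
// Hypothetical helper: rebuilds a String from the numeric values in the
// quoted listing and prints one line per UTF-16 char.
class CodePointDump {
    // Values copied from the quoted "Character N = '...'" lines.
    static final int[] QUOTED_VALUES =
            {8211, 235, 8212, 196, 8212, 231, 8211, 937, 8212, 199};

    // Build a String from an array of Unicode code points.
    static String fromCodePoints(int[] values) {
        return new String(values, 0, values.length);
    }

    // Describe each char by index, decimal value and U+XXXX notation.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            int c = s.charAt(i);
            sb.append(String.format("Character %d = '%d' (U+%04X)%n", i, c, c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = fromCodePoints(QUOTED_VALUES);
        System.out.print(describe(input));
    }
}
```

Values such as 8211 (U+2013, en dash) and 8212 (U+2014, em dash) alternating with accented Latin letters are a typical sign that multi-byte sequences were decoded one byte at a time.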

So at some point you, or some library you're using, has done something
like read a UTF-8 byte sequence from a file and shoved it byte by byte
into a String, one character per byte. Another possible culprit is a wrong
(implicit?) encoding conversion or cast from a byte array type to a
Unicode string type.
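
As a rough illustration of that kind of mistake, the sketch below decodes UTF-8 bytes with a single-byte charset. The class name, the method names and the sample text are all hypothetical, and ISO-8859-1 is used only because every JVM ships it; the thread does not say which charset was actually involved in your case.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo of mangling text by decoding bytes with the wrong charset.
class MojibakeDemo {
    // Encode as UTF-8, then decode with a different charset:
    // with a single-byte charset, every byte turns into its own char.
    static String misdecode(String original, Charset wrong) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, wrong);
    }

    // Reverse the mistake. This only works when the wrong charset maps
    // every byte to a distinct char (true for ISO-8859-1).
    static String repair(String mangled, Charset wrong) {
        return new String(mangled.getBytes(wrong), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Sample text chosen for illustration: five Cyrillic letters,
        // each of which takes two bytes in UTF-8.
        String original = "\u0411\u0440\u044d\u043d\u0442";
        String mangled = misdecode(original, StandardCharsets.ISO_8859_1);

        System.out.println(original.length()); // 5
        System.out.println(mangled.length());  // 10
        System.out.println(repair(mangled, StandardCharsets.ISO_8859_1).equals(original)); // true
    }
}
```

Five characters going in and ten coming out is the same doubling you see in your dump. Repairing after the fact is a last resort; the real fix is to decode with the right charset where the bytes first become a String.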

The JDBC driver is storing exactly what you tell it to, and the good ol'
GIGO (garbage in, garbage out) rule is being applied.
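
One way to keep the garbage out, sketched under the assumption that your text comes from a UTF-8 file (I don't know where your input really originates), is to name the charset explicitly when turning bytes into a String instead of relying on the platform default. The class and method names are illustrative only.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: reads a file as text with an explicit charset.
class ExplicitCharsetRead {
    // Decode the file's bytes as UTF-8, regardless of the platform default.
    static String readUtf8(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("encoding-demo", ".txt");
        try {
            // Write five Cyrillic letters as UTF-8 (ten bytes on disk).
            Files.write(tmp, "\u0411\u0440\u044d\u043d\u0442".getBytes(StandardCharsets.UTF_8));
            String text = readUtf8(tmp);
            System.out.println(text.length()); // 5
        } finally {
            Files.delete(tmp);
        }
    }
}
```

A String read this way can be passed to PreparedStatement.setString() as-is; the driver takes care of the conversion to the database encoding.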

--
Craig Ringer
