Re: [JDBC] ArrayIndexOutOfBoundsException in Encoding.decodeUTF8() - Mailing list pgsql-general

From Barry Lind
Subject Re: [JDBC] ArrayIndexOutOfBoundsException in Encoding.decodeUTF8()
Date
Msg-id 3E1C9C54.1060903@xythos.com
In response to Re: [JDBC] ArrayIndexOutOfBoundsException in Encoding.decodeUTF8()  (Joseph Shraibman <jks@selectacast.net>)
Responses Character Encoding WAS: ArrayIndexOutOfBoundsException in Encoding.decodeUTF8()
List pgsql-general

Joseph Shraibman wrote:
>>
>> In postgres UNICODE means utf8.
>
>
> Which differs from java unicode?
>

Yes.  Java represents strings internally as 16-bit code units (UTF-16;
older documentation calls this UCS-2), so most characters take two bytes,
whereas UTF-8 is a variable-length encoding that uses 1 to 4 bytes per
character (1 to 3 for anything in the Basic Multilingual Plane).
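
To make the size difference concrete, here is a minimal sketch that encodes a
few sample strings both ways and counts the bytes.  The class and method names
(EncodingLengths, utf8Length, utf16Length) are made up for illustration and
are not part of the JDBC driver; it only uses the standard String.getBytes()
call.  Note that a character outside the Basic Multilingual Plane takes 4
bytes in UTF-8 and two 16-bit code units in a Java String.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
public class EncodingLengths {

    // Number of bytes needed to store s as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes needed to store s as UTF-16 (big-endian, no byte order mark).
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        // ASCII letter, Latin-1 letter, euro sign, and an emoji (surrogate pair).
        String[] samples = {"a", "\u00e9", "\u20ac", "\uD83D\uDE00"};
        for (String s : samples) {
            System.out.println(s.length() + " Java char(s), "
                    + utf8Length(s) + " UTF-8 byte(s), "
                    + utf16Length(s) + " UTF-16 byte(s)");
        }
    }
}
```

The JDBC driver has to do this kind of conversion on every string it reads
from a UNICODE (UTF-8) database, which is where decodeUTF8() comes in.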

> I notice there is no way to change a database's encoding.  If I just
> change the encoding type in the pg_database to latin1 will there be data
> loss?

The recommended way to do this would be to dump the contents of the
database, create a new database with the desired character set and then
import the data into that new database.  I don't know if changing
pg_database directly would work or not.

--Barry


