Thread: using 8 bit ascii
I have a Postgres database (version 7.4.2) that is using ASCII character 233, which is an 8-bit ASCII character. I also use JBoss. My problem is that when I try to retrieve a result set that has a record with one of the 8-bit ASCII characters, I get a message from JBoss (see the error message below).

My question: is there a way to configure the Postgres JDBC driver to allow for this range of characters?

2004-10-26 16:54:51,167 ERROR [STDERR] org.postgresql.util.PSQLException: Invalid character data was found. This is most likely caused by stored data containing characters that are invalid for the character set the database was created in. The most common example of this is storing 8bit data in a SQL_ASCII database.
2004-10-26 16:54:51,167 ERROR [STDERR] at org.postgresql.core.Encoding.decodeUTF8(Encoding.java:287)
2004-10-26 16:54:51,167 ERROR [STDERR] at org.postgresql.core.Encoding.decode(Encoding.java:182)
2004-10-26 16:54:51,167 ERROR [STDERR] at org.postgresql.core.Encoding.decode(Encoding.java:198)
2004-10-26 16:54:51,167 ERROR [STDERR] at org.postgresql.jdbc1.AbstractJdbc1ResultSet.getString(AbstractJdbc1ResultSet.java:201)
2004-10-26 16:54:51,167 ERROR [STDERR] at org.postgresql.jdbc1.AbstractJdbc1ResultSet.getString(AbstractJdbc1ResultSet.java:475)
2004-10-26 16:54:51,168 ERROR [STDERR] at payroll.DeptWorkers.loadWorkers(DeptWorkers.java:52)
2004-10-26 16:54:51,168 ERROR [STDERR] at org.apache.jsp.manager_jsp._jspService(manager_jsp.jav
Hello Jason,

ASCII is only 7-bit: values 0 to 127.

ISO-8859-1 is an example of an 8-bit character set (values 0 to 255). 233 is é in ISO-8859-1 (Latin-1).

You should create the database with an encoding that can handle 8-bit characters, for example ISO-8859-1 (PostgreSQL: LATIN1) or UTF-8 (PostgreSQL: UNICODE).

Anders

* Jason Tesser (JTesser@nbbc.edu) wrote:
> My question is there a way to configure the postgres jdbc driver to
> allow for this range of characters?
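To make the 7-bit versus 8-bit point concrete, here is a minimal Java sketch. It only uses the standard library; the class and method names (Latin1Demo, isAscii, decodeLatin1) are made up for illustration and have nothing to do with the JDBC driver.

```java
import java.nio.charset.StandardCharsets;

public class Latin1Demo {
    // ASCII defines only the 7-bit range 0 to 127.
    static boolean isAscii(int value) {
        return value >= 0 && value <= 127;
    }

    // Interpret a single byte value (0 to 255) as an ISO-8859-1 (Latin-1) character.
    static String decodeLatin1(int value) {
        return new String(new byte[] { (byte) value }, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // 233 is outside ASCII, but it is a valid Latin-1 code for e-acute (U+00E9).
        System.out.println("233 is ASCII: " + isAscii(233));
        System.out.println("233 as Latin-1 is U+00E9: " + decodeLatin1(233).equals("\u00e9"));
    }
}
```

So "character 233" only has a meaning once you say which 8-bit character set the byte belongs to.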
OK, I tried Unicode, but the data won't load: it says it cannot support the Unicode values I am inserting. I tried converting the data as a text file and everything. Nothing has worked there. With ODBC, using Access for example, I can pull the 8-bit characters out just fine from the same database. So why can't I do the same with the Postgres JDBC driver?

I understand that ASCII is 7-bit, but these are extended ASCII. I will try Latin-1.

> -----Original Message-----
> From: pgsql-jdbc-owner@postgresql.org [mailto:pgsql-jdbc-owner@postgresql.org] On Behalf Of Anders Hermansen
> Sent: Wednesday, October 27, 2004 8:12 AM
> To: pgsql-jdbc@postgresql.org
> Subject: Re: [JDBC] using 8 bit ascii
>
> Hello Jason,
>
> ASCII is only 7-bit. Values 0 to 127.
>
> ISO-8859-1 is an example of a character set with 8-bits (0 to 255).
> 233 is é in ISO-8859-1 (Latin-1).
>
> You should create the database with an encoding which can handle 8-bit
> characters. I.e. ISO-8859-1 (Postgresql: Latin-1) or UTF-8 (Postgresql:
> UNICODE)
>
> Anders
Is it the JDBC driver or another interface that says it cannot support the Unicode values when you insert? What is the exact error message you get?

If you use, for example, ODBC and insert Latin-1 characters into a Unicode database, things will go wrong. You can issue the following statement:

SET CLIENT_ENCODING TO 'LATIN1';

This tells PostgreSQL to expect Latin-1 characters from the client. PostgreSQL will then automatically convert them to the database's character set if necessary.

The JDBC driver always operates in UNICODE mode, so it should not have any problems with either Latin-1 or Unicode databases. I use the JDBC driver with both Latin-1 and Unicode databases with no problems.

I also use psql for some scripts, but I have an ISO-8859-1 terminal, so I execute the above statement first. (Actually, I have "\encoding LATIN1" in my .psqlrc file.)

Anders

* Jason Tesser (JTesser@nbbc.edu) wrote:
> OK I tried the Unicode but the data won't come in as it says it cannot support the Unicode values I am inserting. I tried converting the data as a text file and everything. Nothing has worked there. With odbc using access for example I can pull the 8 bit characters out just fine from the same database. So why can I not using postgres jdbc?
> I understand that ascii is 7 bit but these are extended ascii. I will try Latin 1
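As a rough illustration of why the server has to be told the client's encoding, the sketch below prints the bytes for é in Latin-1 and in UTF-8. It is plain Java with no database involved, and the names (EncodedBytesDemo, latin1Hex, utf8Hex) are hypothetical helpers, not part of any driver.

```java
import java.nio.charset.StandardCharsets;

public class EncodedBytesDemo {
    // Render a byte array as upper-case hex, two digits per byte.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Bytes a Latin-1 client would send for the given text.
    static String latin1Hex(String text) {
        return hex(text.getBytes(StandardCharsets.ISO_8859_1));
    }

    // Bytes a UTF-8 ("UNICODE") client would send for the same text.
    static String utf8Hex(String text) {
        return hex(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9";
        System.out.println("Latin-1: " + latin1Hex(eAcute)); // one byte
        System.out.println("UTF-8:   " + utf8Hex(eAcute));   // two bytes
    }
}
```

The same character is one byte (E9) in Latin-1 and two bytes (C3 A9) in UTF-8, so a client that sends Latin-1 bytes without declaring its encoding is sending something a Unicode database cannot interpret as intended.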
Jason Tesser wrote:
> 2004-10-26 16:54:51,167 ERROR [STDERR]
> org.postgresql.util.PSQLException: Invalid character data was found.
> This is most likely caused by stored data containing characters that are
> invalid for the character set the database was created in. The most
> common example of this is storing 8bit data in a SQL_ASCII database.

As the error says, this problem usually arises from storing 8-bit data in a SQL_ASCII database.

The JDBC driver always sets client_encoding = UNICODE and expects the data arriving from the server to be UTF8 ("unicode") encoded. When you have a SQL_ASCII database, the server has no information about how to translate characters above 127 into the corresponding Unicode values, so it just passes them straight out. Then JDBC complains about invalid Unicode sequences.

It's not just a case of somehow making the JDBC driver accept those sequences; the driver really does need them translated to Unicode, as Java's internal string format uses a Unicode representation. To do this translation, you need information about the actual encoding the data is using. For post-7.2 servers, the JDBC driver chooses to let the server deal with this, so you need to get the encoding information right on the database side.

So you will need to recreate your database using an appropriate encoding that reflects the data stored in it. Presumably those high-ASCII sequences already in the database are *not* Unicode; they're probably ISO-8859-1 or something similar? In that case you can probably dump and reload into a database created with the LATIN1 encoding.

-O
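The failure described above can be reproduced without a database. The following is only a sketch using the standard java.nio.charset API, not the driver's own Encoding.decodeUTF8 code; the class and method names (StrictUtf8Demo, isValidUtf8) are invented for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictUtf8Demo {
    // Return true only if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] data) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Raw byte 233 (0xE9), as a SQL_ASCII database would pass it through unchanged.
        byte[] rawLatin1 = { (byte) 0xE9 };
        // The same character after a correct server-side conversion to UTF-8.
        byte[] converted = { (byte) 0xC3, (byte) 0xA9 };

        System.out.println("raw 0xE9 valid UTF-8: " + isValidUtf8(rawLatin1));
        System.out.println("0xC3 0xA9 valid UTF-8: " + isValidUtf8(converted));
    }
}
```

A lone 0xE9 byte starts a multi-byte UTF-8 sequence that never completes, which is why a strict decoder rejects it. Once the database has a real encoding such as LATIN1, the server can convert that byte to C3 A9 before it reaches the client.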