Thread: Hebrew support -- please help !
Hi all,
I am developing an application (Java/JSP) on Red Hat with PostgreSQL 7.4.3 and the PostgreSQL 7.4.2 JDBC3 driver with SSL (build 213).
This application needs to serve pages in LATIN1 and Hebrew.
For that I created two databases. The first is a DB in UNICODE encoding with client encoding LATIN1, and the JSP charset is iso-8859-1. This works fine in both directions, client to server and server to client.
So I decided to do the same thing for Hebrew: a DB in UNICODE with client encoding ISO-8859-8 and the JSP charset set to iso-8859-8. With this setup I get strange characters.
I also tried a DB in UNICODE encoding with client encoding UNICODE and the JSP charset set to UNICODE, but I get the same strange characters.
Does anyone have a solution or a way to resolve this problem?
Elie
> Hi all,
>
> I develop an application (JAVA/JSP) on RedHat PostgreSQL 7.4.3 with PostgreSQL 7.4.2 JDBC3 with SSL (build 213).
>
> This application needs to serve pages in LATIN1 and Hebrew.
> For that I create 2 Databases, one DB in UNICODE encoding with client encoding LATIN1 when in the JSP the charset encoding is iso-8859-1 and it's working fine from client to server and server to client.
>
> so I decide to do the same thing with Hebrew -- DB in UNICODE with client encoding ISO-8859-8 when in the JSP the charset encoding is iso-8859-8, then I got strange characters.
>
> I also tried to create DB in UNICODE encoding with client encoding UNICODE when in the JSP the charset encoding is UNICODE, but same problem I got strange characters.
>
> Does someone have a solution or a way to resolve this problem.

If I understand correctly, the JDBC driver issues "set client_encoding to iso-8859-8" in your case. You should check that first. If it does the right thing, then you might want to check the conversion maps. They are located at:

src/backend/utils/mb/Unicode/utf8_to_iso8859_8.map // UNICODE(UTF-8) -> ISO-8859-8
src/backend/utils/mb/Unicode/iso8859_8_to_utf8.map // ISO-8859-8 -> UNICODE(UTF-8)

If you find anything wrong, please let me know.
--
Tatsuo Ishii
Hi Elie,

>> I develop an application (JAVA/JSP) on RedHat PostgreSQL 7.4.3 with
>> PostgreSQL 7.4.2 JDBC3 with SSL (build 213).
>> This application needs to serve pages in LATIN1 and Hebrew.
>> For that I create 2 Databases, one DB in UNICODE encoding with client
>> encoding LATIN1 when in the JSP the charset encoding is iso-8859-1 and
>> it's working fine from client to server and server to client.
>> so I decide to do the same thing with Hebrew -- DB in UNICODE with client
>> encoding ISO-8859-8 when in the JSP the charset encoding is iso-8859-8,
>> then I got strange characters.
>> I also tried to create DB in UNICODE encoding with client encoding UNICODE
>> when in the JSP the charset encoding is UNICODE, but same problem I got
>> strange characters.
>> Does someone have a solution or a way to resolve this problem.

Are you sure your client-side display and fonts are set correctly?
Can you look at the character codes you get back from Postgres to check whether they are the correct Hebrew characters?
You can send me some of the strings you got by email and I'll look at them if you want.

Bye,
Guy.
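
One way to act on Guy's suggestion is to print the code points of a string that came back from the database and see whether they fall in the Unicode Hebrew block (U+0590 to U+05FF). The following is only a minimal sketch: the class and method names are made up for illustration, the sample word is an arbitrary Hebrew string, and the "bad" value is produced by deliberately decoding UTF-8 bytes as ISO-8859-1, which is one common way to get strange characters, not a confirmed diagnosis of this case.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of PostgreSQL, JDBC, or Tomcat.
final class CodePointDump {

    /** Formats every code point of s as U+XXXX, separated by spaces. */
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    /** Crude check: non-empty, and every non-whitespace code point is in the Hebrew block. */
    static boolean allHebrew(String s) {
        return !s.isEmpty() && s.codePoints()
                .filter(cp -> !Character.isWhitespace(cp))
                .allMatch(cp -> Character.UnicodeBlock.of(cp) == Character.UnicodeBlock.HEBREW);
    }

    public static void main(String[] args) {
        // Four Hebrew letters, written as escapes so the source file encoding does not matter.
        String good = "\u05E9\u05DC\u05D5\u05DD";
        // Simulated damage: the UTF-8 bytes of the same word read back as ISO-8859-1.
        String bad = new String(good.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);

        System.out.println(dump(good) + "  allHebrew=" + allHebrew(good));
        System.out.println(dump(bad) + "  allHebrew=" + allHebrew(bad));
    }
}
```

In a real test, the string passed to `dump` would be the value read from the `ResultSet`; if the output shows code points such as U+00D7 instead of U+05xx, the text was already mis-decoded before it reached the page.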
Hi Elie,

actually it is much simpler if you use Unicode encoding on both sides.

Anyway, where do you see the strange characters? If you look at your DB through command-line psql, it is crucial to have the correct encoding there. I usually work on the command line of my Linux box. I log in via PuTTY, which on a Win98 machine is not capable of displaying Unicode characters correctly.

Ulrich
--
Relevant Traffic AB, Riddargatan 10, 11435 Stockholm, Sweden
Tel. +46-8-6789750
http://www.relevanttraffic.se
Hi,
> If I understand correctly, JDBC driver issues "set client_encoding to
> iso-8859-8" in your case. You should check it first. If it does the
> right thing, then you might want to check the conversion maps. They are located:
> src/backend/utils/mb/Unicode/utf8_to_iso8859_8.map // UNICODE(UTF-8) -> ISO-8859-8
> src/backend/utils/mb/Unicode/iso8859_8_to_utf8.map // ISO-8859-8 -> UNICODE(UTF-8)
> If you find anything wrong, please let me know.
I installed PostgreSQL from these RPM files:
* postgresql-7.4.3-2PGDG.i686.rpm
* postgresql-jdbc-7.4.3-2PGDG.i686.rpm
* postgresql-libs-7.4.3-2PGDG.i686.rpm
* postgresql-server-7.4.3-2PGDG.i686.rpm
So in my installation there are no map files, but there are a lot of .so files in /usr/lib/pgsql.
I can see that there is no utf8_and_iso8859_8.so file. How can I get or compile this file?
The second solution, which I prefer but which failed, was to work in UTF-8 only on both the server and the client side. The DB was UNICODE, 'show client_encoding' returned unicode, and the charset in the JSP was utf-8. Any ideas?
Elie
> > If I understand correctly, JDBC driver issues "set client_encoding to
> > iso-8859-8" in your case. You should check it first. If it does the
> > right thing, then you might want to the conversion maps. They are located:
> >
> > src/backend/utils/mb/Unicode/utf8_to_iso8859_8.map // UNICODE(UTF-8) -> ISO-8859-8
> > src/backend/utils/mb/Unicode/iso8859_8_to_utf8.map // ISO-8859-8 -> UNICODE(UTF-8)
> >
> > If you find anything wrong, please let me know.
>
> I installed postgresql from a rpm files:
> * postgresql-7.4.3-2PGDG.i686.rpm
> * postgresql-jdbc-7.4.3-2PGDG.i686.rpm
> * postgresql-libs-7.4.3-2PGDG.i686.rpm
> * postgresql-server-7.4.3-2PGDG.i686.rpm
>
> So in my installation there are no map file but there are a lot of so's in /usr/lib/pgsql.
> I can observe that there is no utf8_and_iso8859_8.so file. How can I got/compile this file ?

It's in utf8_and_iso8859.so.

> The second solution, that I prefer but that failed was to work server/client side only in utf8. The DB was UNICODE, the 'show client_encoding' returned unicode and the charset in the jsp was utf-8. Any idea !?

In PostgreSQL, unicode, utf8 and utf-8 are all equivalent.
--
Tatsuo Ishii
Hi Guy,
> Are you sure your client side display and fonts are set correctly?
Yes, I can display Hebrew fonts.
> Can you look at the ASCII codes you get back from Postgres to make sure they are not the correct Hebrew characters.
I use pgAdmin3 on W2K to insert rows in Hebrew and I can see the right characters.
My /etc/sysconfig/i18n file:
LANG="en_US.UTF-8"
SUPPORTED="fr_FR.UTF-8:fr_FR:fr:en_US.UTF-8:en_US:en"
SYSFONT="latarcyrheb-sun16"
Now I tried this configuration:
* DB in UNICODE, 'set client_encoding to UNICODE'
* Apache 2.0 + mod_jk2: I don't see any encoding setting
* Tomcat 5: javaEncoding to UTF8 (default value)
* In each JSP:
<%@ page contentType="text/html;charset=utf-8" language="java"%>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
Now I get the right data from the DB in Hebrew and French.
I still can't save Hebrew values from the client (browser) to the server, and in French, if I save 'été' in a varchar(3) column, I get a "range over" exception.
What is happening here? Is there something else to configure?
Elie
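
A plausible explanation for the varchar(3) failure, offered as an assumption rather than something confirmed in the thread: if the browser submits the form as UTF-8 but the servlet container decodes the request body as ISO-8859-1 (the servlet default when no request encoding is set), the three-letter word 'été' arrives as five characters. The sketch below only simulates that mismatch in plain Java; the class and method names are invented for illustration and no Tomcat API is involved.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical simulation of a form value being decoded with the wrong charset.
final class FormDecodingDemo {

    /** Encodes the typed text as UTF-8 (what a UTF-8 page submits) and decodes it as the container would. */
    static String decodeAs(String typed, Charset containerCharset) {
        byte[] wire = typed.getBytes(StandardCharsets.UTF_8);
        return new String(wire, containerCharset);
    }

    public static void main(String[] args) {
        String typed = "\u00E9t\u00E9"; // 'été', written with escapes

        String wrong = decodeAs(typed, StandardCharsets.ISO_8859_1);
        String right = decodeAs(typed, StandardCharsets.UTF_8);

        // Each 'é' is two bytes in UTF-8, so the wrong decoding yields 2 + 1 + 2 = 5 characters.
        System.out.println("decoded as ISO-8859-1: length " + wrong.length());
        System.out.println("decoded as UTF-8:      length " + right.length());
    }
}
```

Under that assumption, the five-character string no longer fits a varchar(3) column, and the same mis-decoding would turn submitted Hebrew text into unrelated Latin-1 characters.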
> Now I got the right information from the DB in Hebrew and french.
> Still can't save value in Hebrew from client (browser) to server and
> in french if I save 'été' in a column varchar(3) I got a range over
> exception.
>
> What happen here ! something else to config ?

This is a Tomcat question...

When processing forms, set the encoding of the form element using the following attribute (in addition to all that you are doing):

enctype="text/plain;charset=UTF-8"

e.g.

<form action="blah" ... enctype="text/plain;charset=UTF-8">

Also add a hidden field with special/accented characters in all your forms. Query the parameter value (of the hidden field) in your servlet/filter/JSP to see what you actually got. It could be that the browser didn't respect your encoding, and you don't get back what you put there in the first place!

Hope that helps.

John Sidney-Woollett
Hi Elie,

> Now I got the right information from the DB in Hebrew and french.
> Still can't save value in Hebrew from client (browser) to server and in
> french if I save 'été' in a column varchar(3) I got a range over exception.

you are working through the web. That makes the whole thing a lot more complex. Try to use UTF-8/UNICODE encoding all the way. That means configuring Apache(!!!) and Tomcat to serve pages in UTF-8 encoding. All web pages should have

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

Check in your browser whether UTF-8 encoding is used. On the other operating system, in IE, use View > Encoding; it should show UTF-8.

Actually, the browser-to-server communication is where it is hardest to ensure that the right encoding is used. We have a database with Japanese, Hebrew, German, Swedish, French, Spanish and English content.

Ulrich
Hi Ulrich,
I finally resolved the problem.
Here is the solution:
DB server side:
===========
* DB encoding UNICODE
* DB client encoding UNICODE
JSP side:
=======
* <%@ page contentType="text/html;charset=UTF-8" language="java"%>
* <%@ page pageEncoding="ISO-8859-1"%>
* request.setCharacterEncoding("UTF-8");
* <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
Tomcat side: (server.xml file)
=========
<Connector port="8009" enableLookups="false" redirectPort="8443" debug="0"
protocol="AJP/1.3" URIEncoding="UTF-8"/>
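
To illustrate what the URIEncoding attribute changes, here is a small stand-alone sketch that percent-decodes the same query value once as UTF-8 and once as ISO-8859-1 using the JDK's `java.net.URLDecoder`. It does not call Tomcat; the class name, the helper method, and the sample value (an arbitrary four-letter Hebrew word, percent-encoded as UTF-8) are assumptions made for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo: the same percent-encoded bytes interpreted with two charsets.
final class UriDecodingDemo {

    /** Percent-decodes a value with the named charset, wrapping the checked exception. */
    static String decode(String percentEncoded, String charsetName) {
        try {
            return URLDecoder.decode(percentEncoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // Four Hebrew letters as a browser would send them in a UTF-8 encoded URL.
        String value = "%D7%A9%D7%9C%D7%95%D7%9D";

        String asUtf8 = decode(value, "UTF-8");
        String asLatin1 = decode(value, "ISO-8859-1");

        System.out.println("UTF-8:      " + asUtf8.length() + " characters");
        System.out.println("ISO-8859-1: " + asLatin1.length() + " characters");
        System.out.println("UTF-8 result is the Hebrew word: "
                + asUtf8.equals("\u05E9\u05DC\u05D5\u05DD"));
    }
}
```

Decoded as ISO-8859-1, every byte becomes its own character, so a four-letter word turns into eight unrelated characters; that is the kind of damage GET parameters suffer when the connector does not decode URIs as UTF-8.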
For those who use a property bundle file (key/value): you need to save this file in
ISO-8859-1. If it contains Unicode characters, you have to convert the file with the native2ascii tool.
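
The reason for the native2ascii step is that `java.util.Properties.load(InputStream)` reads the file as ISO-8859-1 and expands `\uXXXX` escapes itself. A minimal sketch of that behaviour follows; the key name `greeting`, the class name, and the in-memory "file" are all invented for the example, and the escaped value stands in for what native2ascii would write for a Hebrew word.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo of loading an escaped (native2ascii-style) properties file.
final class BundleEscapeDemo {

    /** Loads properties from text that is stored, like a .properties file, as ISO-8859-1 bytes. */
    static Properties loadLatin1(String fileContent) {
        Properties props = new Properties();
        byte[] bytes = fileContent.getBytes(StandardCharsets.ISO_8859_1);
        try {
            props.load(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // The doubled backslashes put a literal \u05E9... sequence into the file content,
        // which is pure ASCII and therefore safe to store as ISO-8859-1.
        String file = "greeting=\\u05E9\\u05DC\\u05D5\\u05DD\n";

        String value = loadLatin1(file).getProperty("greeting");

        // Properties.load has turned the escapes back into four Hebrew letters.
        System.out.println("length = " + value.length());
        System.out.println("is Hebrew word = " + value.equals("\u05E9\u05DC\u05D5\u05DD"));
    }
}
```

A bundle saved directly as UTF-8 without escapes would be read byte by byte as ISO-8859-1 and come out garbled, which is why the conversion is needed.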
Thanks to all,
Elie