German umlauts problem (under WindowsXP, COBOL programm) - Mailing list pgsql-general
From | Libo Luo |
---|---|
Subject | German umlauts problem (under WindowsXP, COBOL programm) |
Date | |
Msg-id | 20050310172805.15760.qmail@web51806.mail.yahoo.com |
List | pgsql-general |
Hello everyone,

my colleagues and I are trying to convert our old database system to PG. We created a small client-server prototype and used a Java program (J2SE, version 1.4.1_01, JDBC driver: pgdev.307.jdbc3) to test it. Everything works well, and the German umlauts (ä, ö, ü, Ä, Ö, Ü, ß) can be inserted, updated and displayed correctly.

Since we already have many COBOL programs, we want to keep using them. Unfortunately, we have problems handling the German umlauts there. Here are the details:

Operating system: Windows XP Professional, Version 2002, Service Pack 1
PostgreSQL version: 8.0.0
Locale: German (set during installation with PG_installer)
Default encoding: LATIN1 (set during installation with PG_installer)
ODBC driver: psqlodbc, version 8.00.01.01 (of 05.03.2005)
COBOL: Micro Focus COBOL, compiler: NetExpress version 4.0.38

The database is created with the encoding LATIN1 (createdb -E LATIN1 ....).

1) Insert umlauts
Before we insert umlauts, we have to set the client_encoding to LATIN1 explicitly in the COBOL programs; otherwise we get the error "could not convert UTF-8 character 0x00e4 to ISO8859-1" (0x00e4 = 228 = German "ä" in ISO8859-1). After setting the client_encoding to LATIN1, we can see through pgAdmin III that the umlauts are saved correctly.

2) Read umlauts
To read the umlauts, we have to set the client_encoding to UNICODE; otherwise we get only question marks (?) instead of umlauts. If we set the client_encoding to LATIN1, we also get question marks.

3) Umlauts in select condition
Say text001 is "Verträge". If we execute the query "select ....from....where text > text001", we get the error "could not convert UTF-8 character 0x00e4 to ISO8859-1". (Setting the client_encoding to LATIN1 or UNICODE does not help here.)

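The three symptoms above would be consistent with LATIN1 bytes being interpreted as UTF-8 somewhere between the COBOL program and the server. The following is only a byte-level sketch in Python, not part of the setup described in this message; it assumes the COBOL program hands over single-byte LATIN1 text, and the variable and function names are made up for illustration.

```python
# Illustration only: the same word in the two encodings involved.
text = "Vertr\u00e4ge"  # "Vertraege" written with a-umlaut (U+00E4)

latin1_bytes = text.encode("latin-1")  # one byte per character, a-umlaut = 0xE4
utf8_bytes = text.encode("utf-8")      # a-umlaut becomes the two bytes 0xC3 0xA4


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if data is a valid UTF-8 byte sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Problems 1) and 3): LATIN1 bytes arrive while the session claims UTF-8.
# A lone 0xE4 followed by an ASCII letter is not valid UTF-8.
latin1_accepted_as_utf8 = decodes_as_utf8(latin1_bytes)

# Problem 2): LATIN1 bytes are read back but interpreted as UTF-8.
# Python substitutes U+FFFD; other software may show "?" instead.
misread = latin1_bytes.decode("utf-8", errors="replace")

print(latin1_bytes)
print(utf8_bytes)
print(latin1_accepted_as_utf8)
print(misread)
```

If this assumption holds, the declared client_encoding and the bytes actually exchanged have to agree in both directions, which is what the explicit SET statements in the COBOL programs work around.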
We opened the PG log (configured in the system DSN) and found that the PG server does the following:

a) It checks what the client_encoding is:
conn=3620872, query='select pg_client_encoding()'
[ fetched 1 rows ]
[ Client encoding = 'LATIN1' (code = 8) ]

b) Once the client_encoding is known, it is set to UTF8. So if we don't reset it to LATIN1, the server assumes the client sends UTF-8 and tries to convert it to ISO8859-1, which triggers the error described in 1) when inserting umlauts.

But why does the server set the client_encoding to UTF8? Should I set some environment variables?

In the COBOL test program we do the following:
1) read a tuple which has umlauts from a table
2) then update it with new umlauts
3) finally read the tuples where seltext > 'Kunden mit Verträgen'.

Here is the log file psqlodbc_3888.log:

conn=3620872, PGAPI_DriverConnect( in)='DSN=AVUSDB;UID=avus;PWD=xxxx;', fDriverCompletion=0
DSN info: DSN='AVUSDB',server='localhost',port='5432',dbase='avusdb',user='avus',passwd='xxxxx'
onlyread='0',protocol='6.4',showoid='0',fakeoidindex='0',showsystable='0'
conn_settings='',conn_encoding='OTHER'
translation_dll='',translation_option=''
Global Options: Version='08.00.0101', fetch=100, socket=8192, unknown_sizes=0, max_varchar_size=254, max_longvarchar_size=8190
disable_optimizer=1, ksqo=1, unique_index=1, use_declarefetch=0
text_as_longvarchar=1, unknowns_as_longvarchar=0, bools_as_char=1 NAMEDATALEN=64
extra_systable_prefixes='dd_;', conn_settings='' conn_encoding='OTHER'
conn=3620872, query=' '
conn=3620872, query='select version()'
[ fetched 1 rows ]
[ PostgreSQL version string = 'PostgreSQL 8.0.0 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2 (mingw-special)' ]
[ PostgreSQL version number = '8.0' ]
conn=3620872, query='set DateStyle to 'ISO''
conn=3620872, query='set geqo to 'OFF''
conn=3620872, query='set extra_float_digits to 2'
conn=3620872, query='select oid from pg_type where typname='lo''
[ fetched 0 rows ]
conn=3620872, query='select pg_client_encoding()'
[ fetched 1 rows ]
[ Client encoding = 'LATIN1' (code = 8) ] // this is the default encoding. That's OK.
conn=3620872, query='set client_encoding to 'UTF8'' // Why does the server set the client_encoding to be UTF8 here????
conn=3620872, PGAPI_DriverConnect(out)='DSN=AVUSDB;DATABASE=avusdb;SERVER=localhost;PORT=5432;UID=avus;PWD=xxxx;A6=;A7=100;A8=8192;B0=254;B1=8190;BI=0;C2=dd_;;CX=1b50fa9'
conn=3620872, query='SELECT * FROM DB31 WHERE SELNR = '90001' '
[ fetched 1 rows ]
conn=3620872, query='UPDATE DB31 SET SELTEXT = 'äöüÄÖÜß update' , SELANW1 = ' ' , SELANW2 = ' ' , SELANW3 = ' ' , SELANW4 = ' ' , SELANW5 = ' ' , SELNUTZ = ' ' , SELANZ = '0' , SELSTEU = ' ' , SELETEXT = ' ' , SELKEY = ' ' WHERE SELNR = '90001' '
ERROR from backend during send_query: 'ERROR: could not convert UTF-8 character 0x00e4 to ISO8859-1'
conn=3620872, query='ROLLBACK'
STATEMENT ERROR: func=SC_execute, desc='', errnum=7, errmsg='Error while executing the query'
------------------------------------------------------------
hdbc=3620872, stmt=3654248, result=3653112
manual_result=0, prepare=1, internal=0
bindings=0, bindings_allocated=0
parameters=3648728, parameters_allocated=12
statement_type=2, statement='UPDATE DB31 SET SELTEXT = ? , SELANW1 = ? , SELANW2 = ? , SELANW3 = ? , SELANW4 = ? , SELANW5 = ? , SELNUTZ = ? , SELANZ = ? , SELSTEU = ? , SELETEXT = ? , SELKEY = ? WHERE SELNR = ? '
stmt_with_params='UPDATE DB31 SET SELTEXT = 'äöüÄÖÜß update' , SELANW1 = ' ' , SELANW2 = ' ' , SELANW3 = ' ' , SELANW4 = ' ' , SELANW5 = ' ' , SELNUTZ = ' ' , SELANZ = '0' , SELSTEU = ' ' , SELETEXT = ' ' , SELKEY = ' ' WHERE SELNR = '90001' '
data_at_exec=-1, current_exec_param=-1, put_data=0
currTuple=-1, current_col=-1, lobj_fd=-1
maxRows=0, rowset_size=1, keyset_size=0, cursor_type=0, scroll_concurrency=1
cursor_name='SQL_CUR0037C268'
----------------QResult Info -------------------------------
fields=3653912, manual_tuples=0, backend_tuples=0, tupleField=0, conn=0
fetch_count=0, num_total_rows=0, num_fields=0, cursor='(NULL)'
message='ERROR: could not convert UTF-8 character 0x00e4 to ISO8859-1', command='(NULL)', notice='(NULL)'
status=7, inTuples=0
CONN ERROR: func=SC_execute, desc='', errnum=110, errmsg='ERROR: could not convert UTF-8 character 0x00e4 to ISO8859-1'
------------------------------------------------------------
henv=3614544, conn=3620872, status=1, num_stmts=16
sock=3614616, stmts=3614704, lobj_type=-999
---------------- Socket Info -------------------------------
socket=1856, reverse=0, errornumber=0, errormsg='(NULL)'
buffer_in=3631904, buffer_out=3640120
buffer_filled_in=11, buffer_filled_out=0, buffer_read_in=11
conn=3620872, query='SET CLIENT_ENCODING TO 'UNICODE''
conn=3620872, query='SELECT * FROM DB31 WHERE SELNR = '90001' '
[ fetched 1 rows ]
conn=3620872, query='SELECT * FROM DB31 WHERE SELTEXT > 'Kunden mit Verträgen' OR (SELTEXT = 'Kunden mit Verträgen' AND SELNR > ' ' ) ORDER BY SELTEXT, SELNR '
ERROR from backend during send_query: 'ERROR: could not convert UTF-8 character 0x00e4 to ISO8859-1'
conn=3620872, query='ROLLBACK'
STATEMENT ERROR: func=SC_execute, desc='', errnum=7, errmsg='Error while executing the query'
------------------------------------------------------------
hdbc=3620872, stmt=15597640, result=3652848
manual_result=0, prepare=1, internal=0
bindings=0, bindings_allocated=0
parameters=3653840, parameters_allocated=3
statement_type=0, statement='SELECT * FROM DB31 WHERE SELTEXT > ? OR (SELTEXT = ? AND SELNR > ? ) ORDER BY SELTEXT, SELNR '
stmt_with_params='SELECT * FROM DB31 WHERE SELTEXT > 'Kunden mit Verträgen' OR (SELTEXT = 'Kunden mit Verträgen' AND SELNR > ' ' ) ORDER BY SELTEXT, SELNR '
data_at_exec=-1, current_exec_param=-1, put_data=0
currTuple=-1, current_col=-1, lobj_fd=-1
maxRows=0, rowset_size=1, keyset_size=0, cursor_type=0, scroll_concurrency=1
cursor_name='nxc1'
----------------QResult Info -------------------------------
fields=3648408, manual_tuples=0, backend_tuples=0, tupleField=0, conn=0
fetch_count=0, num_total_rows=0, num_fields=0, cursor='(NULL)'
message='ERROR: could not convert UTF-8 character 0x00e4 to ISO8859-1', command='(NULL)', notice='(NULL)'
status=7, inTuples=0
CONN ERROR: func=SC_execute, desc='', errnum=110, errmsg='ERROR: could not convert UTF-8 character 0x00e4 to ISO8859-1'
------------------------------------------------------------
henv=3614544, conn=3620872, status=1, num_stmts=16
sock=3614616, stmts=3614704, lobj_type=-999
---------------- Socket Info -------------------------------
socket=1856, reverse=0, errornumber=0, errormsg='(NULL)'
buffer_in=3631904, buffer_out=3640120
buffer_filled_in=11, buffer_filled_out=0, buffer_read_in=11

Thank you very much in advance.

Libo
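
When reading a log like the one above, it can help to pull out only the statements that change the client_encoding and see where they sit relative to the failing queries. The following is a rough Python sketch of that idea. The sample entries are copied from the log in this message (without the inline comments), while the function name and the regular expressions are illustrative assumptions, not part of psqlodbc.

```python
import re

# A few entries copied from the psqlodbc log, one per line.
LOG = """\
conn=3620872, query='select pg_client_encoding()'
conn=3620872, query='set client_encoding to 'UTF8''
conn=3620872, query='SELECT * FROM DB31 WHERE SELNR = '90001' '
conn=3620872, query='SET CLIENT_ENCODING TO 'UNICODE''
"""

# Matches one "conn=<id>, query='<sql>'" entry; the SQL may contain quotes.
QUERY_RE = re.compile(r"^conn=(\d+), query='(.*)'\s*$")
# Matches a SET client_encoding statement, in any letter case.
ENCODING_RE = re.compile(r"^\s*set\s+client_encoding\s+to\s+'?(\w+)'?", re.IGNORECASE)


def encoding_changes(log_text):
    """Return (entry_number, encoding) for each SET client_encoding entry."""
    changes = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        query = QUERY_RE.match(line)
        if not query:
            continue  # not a query entry (result rows, error blocks, ...)
        encoding = ENCODING_RE.match(query.group(2))
        if encoding:
            changes.append((number, encoding.group(1).upper()))
    return changes


changes = encoding_changes(LOG)
print(changes)
```

In the sample, the first change is the switch to UTF8 that happens during connection setup, before any statement from the COBOL program is logged; the second is the explicit SET issued before reading.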