German umlauts problem (under WindowsXP, COBOL programm) - Mailing list pgsql-general

From Libo Luo
Subject German umlauts problem (under WindowsXP, COBOL programm)
Date
Msg-id 20050310172805.15760.qmail@web51806.mail.yahoo.com
Whole thread Raw
List pgsql-general
Hello everyone,

my colleagues and I try to convert our old data base
system to PG.  We created a small client-server
prototype and used a java programm (J2SE, Version
1.4.1_01, JDBC-Treiber:  pgdev.307.jdbc3) to test.
Everything goes well and  the German umlauts (�, �, �,
�, �, �, �) can be inserted, updated and displayed
correctly.

Since we have already many  COBOL programms, so we
want to use them. Unfortunately we have problems in
handling the German umlauts. Here are the problems in
detail:

Operating system: WindowsXP Professional Version 2002,
with Service Package 1
PostgreSQL Version 8.0.0
Locale: German  (set during installation with
PG_installer)
default encoding: LATIN1 (set during installation with
PG_installer)
ODBC-Driver:  psqlodbc, Version 8.00.01.01 (of
05.03.2005)
Micro Focus Cobol, Compiler: NetExpress Version 4.0.38

The data base is created with the encoding Latin1
(createdb -E LATIN1 ....).

1) Insert umlauts:
Before we insert umlauts we have to in the COBOL
programms explicitly set the client_encoding to be
LATIN1, otherwise we get the error "could not convert
UTF-8 character 0x00e4 to ISO8859-1" (0x00e4 = 228 =
German "�" in ISO8859-1). After setting the
client_encodign to LATIN1, we can see through
pgAdminIII that the umlauts are saved correctly.

2) Read umlauts
To read the umlauts we have to set the client_encoding
to be UNICODE, otherwise we get only question mark (?)
instead of umlauts. If we set the client_encoding to
be LATIN1 we get also question marks.

3) Umlauts in select condition
Say text001 is "Vertr�ge". If we execute the query
"select ....from....where text > text001", we get the
error "could not convert UTF-8 character 0x00e4 to
ISO8859-1". (Here it is useless to set the
client_encoding to be LATIN1 or UNICODE).

We opened the PG-log (configurate in System-DSN) and
found out that, the PG-Server does the following:
a) checks that what the client_encoding is:
conn=3620872, query='select pg_client_encoding()'
    [ fetched 1 rows ]
    [ Client encoding = 'LATIN1' (code = 8)
b) After knowing the client_encoding it will set the
client_encoding to be UTF8.

So if we don't reset it to be LATIN1, the server will
think the client send UTF8-code and it will try to
convert UTF8-code to be ISO8859-1, which triggered the
error 1) during insertion of umlauts. But why the
server set the client_encoding to be UTF8? Should I
set some environment variables?

In the COBOL test program we do the following:
1) read a tuple which has umlauts from a table
2) then update it with new umlauts
3) at last read tupels whose "seltext > Kunden mit
Vertr�gen' .

Here is the log file psqlodbc_3888.log:

conn=3620872, PGAPI_DriverConnect(
in)='DSN=AVUSDB;UID=avus;PWD=xxxx;',
fDriverCompletion=0
DSN info:
DSN='AVUSDB',server='localhost',port='5432',dbase='avusdb',user='avus',passwd='xxxxx'

onlyread='0',protocol='6.4',showoid='0',fakeoidindex='0',showsystable='0'
          conn_settings='',conn_encoding='OTHER'
          translation_dll='',translation_option=''
Global Options: Version='08.00.0101', fetch=100,
socket=8192, unknown_sizes=0, max_varchar_size=254,
max_longvarchar_size=8190
                disable_optimizer=1, ksqo=1,
unique_index=1, use_declarefetch=0
                text_as_longvarchar=1,
unknowns_as_longvarchar=0, bools_as_char=1
NAMEDATALEN=64
                extra_systable_prefixes='dd_;',
conn_settings='' conn_encoding='OTHER'
conn=3620872, query=' '
conn=3620872, query='select version()'
    [ fetched 1 rows ]
    [ PostgreSQL version string = 'PostgreSQL 8.0.0 on
i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2
(mingw-special)' ]
    [ PostgreSQL version number = '8.0' ]
conn=3620872, query='set DateStyle to 'ISO''
conn=3620872, query='set geqo to 'OFF''
conn=3620872, query='set extra_float_digits to 2'
conn=3620872, query='select oid from pg_type where
typname='lo''
    [ fetched 0 rows ]
conn=3620872, query='select pg_client_encoding()'
    [ fetched 1 rows ]

    [ Client encoding = 'LATIN1' (code = 8) ]   //
this is the default encoding. That's OK.

conn=3620872, query='set client_encoding to 'UTF8''
// Why does the server set the client_encoding to be
UTF8 here????


conn=3620872,

PGAPI_DriverConnect(out)='DSN=AVUSDB;DATABASE=avusdb;SERVER=localhost;PORT=5432;UID=avus;PWD=xxxx;A6=;A7=100;A8=8192;B0=254;B1=8190;BI=0;C2=dd_;;CX=1b50fa9'
conn=3620872, query='SELECT * FROM DB31 WHERE SELNR =
'90001'       '
    [ fetched 1 rows ]
conn=3620872, query='UPDATE DB31 SET SELTEXT =
'������� update'          , SELANW1 = ' '          ,
SELANW2 = ' '          , SELANW3 = ' '          ,
SELANW4 = ' '          , SELANW5 = ' '          ,
SELNUTZ = ' '          , SELANZ = '0'         ,
SELSTEU = ' '          , SELETEXT = ' '           ,
SELKEY = ' '         WHERE SELNR = '90001'       '
ERROR from backend during send_query: 'ERROR:  could
not convert UTF-8 character 0x00e4 to ISO8859-1'
conn=3620872, query='ROLLBACK'
STATEMENT ERROR: func=SC_execute, desc='', errnum=7,
errmsg='Error while executing the query'

------------------------------------------------------------
                 hdbc=3620872, stmt=3654248,
result=3653112
                 manual_result=0, prepare=1,
internal=0
                 bindings=0, bindings_allocated=0
                 parameters=3648728,
parameters_allocated=12
                 statement_type=2, statement='UPDATE
DB31 SET SELTEXT = ?          , SELANW1 = ?          ,
SELANW2 = ?          , SELANW3 = ?          , SELANW4
= ?          , SELANW5 = ?          , SELNUTZ = ?
    , SELANZ = ?         , SELSTEU = ?          ,
SELETEXT = ?           , SELKEY = ?         WHERE
SELNR = ?       '
                 stmt_with_params='UPDATE DB31 SET
SELTEXT = '������� update'          , SELANW1 = ' '
      , SELANW2 = ' '          , SELANW3 = ' '
 , SELANW4 = ' '          , SELANW5 = ' '          ,
SELNUTZ = ' '          , SELANZ = '0'         ,
SELSTEU = ' '          , SELETEXT = ' '           ,
SELKEY = ' '         WHERE SELNR = '90001'       '
                 data_at_exec=-1,
current_exec_param=-1, put_data=0
                 currTuple=-1, current_col=-1,
lobj_fd=-1
                 maxRows=0, rowset_size=1,
keyset_size=0, cursor_type=0, scroll_concurrency=1
                 cursor_name='SQL_CUR0037C268'
                 ----------------QResult Info
-------------------------------
                 fields=3653912, manual_tuples=0,
backend_tuples=0, tupleField=0, conn=0
                 fetch_count=0, num_total_rows=0,
num_fields=0, cursor='(NULL)'
                 message='ERROR:  could not convert
UTF-8 character 0x00e4 to ISO8859-1',
command='(NULL)', notice='(NULL)'
                 status=7, inTuples=0
CONN ERROR: func=SC_execute, desc='', errnum=110,
errmsg='ERROR:  could not convert UTF-8 character
0x00e4 to ISO8859-1'

------------------------------------------------------------
            henv=3614544, conn=3620872, status=1,
num_stmts=16
            sock=3614616, stmts=3614704,
lobj_type=-999
            ---------------- Socket Info
-------------------------------
            socket=1856, reverse=0, errornumber=0,
errormsg='(NULL)'
            buffer_in=3631904, buffer_out=3640120
            buffer_filled_in=11, buffer_filled_out=0,
buffer_read_in=11
conn=3620872, query='SET CLIENT_ENCODING TO 'UNICODE''
conn=3620872, query='SELECT * FROM DB31 WHERE SELNR =
'90001'       '
    [ fetched 1 rows ]
conn=3620872, query='SELECT * FROM DB31 WHERE SELTEXT
> 'Kunden mit Vertr�gen'          OR (SELTEXT =
'Kunden mit Vertr�gen'          AND SELNR > ' '
) ORDER BY SELTEXT, SELNR '
ERROR from backend during send_query: 'ERROR:  could
not convert UTF-8 character 0x00e4 to ISO8859-1'
conn=3620872, query='ROLLBACK'
STATEMENT ERROR: func=SC_execute, desc='', errnum=7,
errmsg='Error while executing the query'

------------------------------------------------------------
                 hdbc=3620872, stmt=15597640,
result=3652848
                 manual_result=0, prepare=1,
internal=0
                 bindings=0, bindings_allocated=0
                 parameters=3653840,
parameters_allocated=3
                 statement_type=0, statement='SELECT *
FROM DB31 WHERE SELTEXT > ?          OR (SELTEXT = ?
       AND SELNR > ?       ) ORDER BY SELTEXT, SELNR '
                 stmt_with_params='SELECT * FROM DB31
WHERE SELTEXT > 'Kunden mit Vertr�gen'          OR
(SELTEXT = 'Kunden mit Vertr�gen'          AND SELNR >
' '       ) ORDER BY SELTEXT, SELNR '
                 data_at_exec=-1,
current_exec_param=-1, put_data=0
                 currTuple=-1, current_col=-1,
lobj_fd=-1
                 maxRows=0, rowset_size=1,
keyset_size=0, cursor_type=0, scroll_concurrency=1
                 cursor_name='nxc1'
                 ----------------QResult Info
-------------------------------
                 fields=3648408, manual_tuples=0,
backend_tuples=0, tupleField=0, conn=0
                 fetch_count=0, num_total_rows=0,
num_fields=0, cursor='(NULL)'
                 message='ERROR:  could not convert
UTF-8 character 0x00e4 to ISO8859-1',
command='(NULL)', notice='(NULL)'
                 status=7, inTuples=0
CONN ERROR: func=SC_execute, desc='', errnum=110,
errmsg='ERROR:  could not convert UTF-8 character
0x00e4 to ISO8859-1'

------------------------------------------------------------
            henv=3614544, conn=3620872, status=1,
num_stmts=16
            sock=3614616, stmts=3614704,
lobj_type=-999
            ---------------- Socket Info
-------------------------------
            socket=1856, reverse=0, errornumber=0,
errormsg='(NULL)'
            buffer_in=3631904, buffer_out=3640120
            buffer_filled_in=11, buffer_filled_out=0,
buffer_read_in=11

Thank you very much in advance.

Libo

__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

pgsql-general by date:

Previous
From: James
Date:
Subject: Postgres stored proc that extracts data from Oracle
Next
From: Edmund Bacon
Date:
Subject: Re: fied separator change from the shell command line