Thread: encoding again
Hi,

sorry that this email is a little bit long, but it is actually not :-))

**** I have a database 'unidb' created with -E UNICODE.

$ psql -l
        List of databases
   Name    |  Owner  | Encoding
-----------+---------+-----------
 unidb     | kathy   | UNICODE

**** I input Chinese data in unicode form. E.g.

logging-threshold=\u65e5\u5fd7\u9608\u503c
polling_setting_error=\u8bbe\u7f6e\u8f6e\u8be2\u95f4\u9694\u65f6\u51fa\u9519

unidb=# show client_encoding;
NOTICE:  Current client encoding is 'UNICODE'
SHOW VARIABLE
unidb=# select * from testbytes;
          name           |          value
-------------------------+-------------------------
 logging_setting_error   | 设置æ¥å¿éå¼æ¶åºé
 polling_setting_error   | 设置轮询é´éæ¶åºé

**** When I retrieve data, I did

unidb=# set client_encoding to 'EUC_CN';
unidb=# show client_encoding;
NOTICE:  Current client encoding is 'EUC_CN'
SHOW VARIABLE
unidb=# select * from testbytes order by value;
          name           |          value
-------------------------+-------------------------
 logging_setting_error   | ־ֵʱ
 polling_setting_error   | ѯʱ

Three problems here:

1) The sorting is based on the unicode value, not the EUC_CN encoding value.

2) I wrote the ResultSet to a file by using OutputStreamWriter(file, "EUC_CN"). The file is not readable from the browser with any charset setting.

3) Changing client_encoding from UNICODE to EUC_CN actually alters/loses the data, if you compare the two "select *" statements above.

I wonder why this happens ?? According to the doc, automatic encoding conversion between UNICODE and EUC_CN is supported.

Any help is highly appreciated.

thanks,
kathy
Kathy Zhu writes:

> 1) the sorting is based on unicode value, not EUC_CN encoding value.

The sorting is always based on the server encoding. There is no way to
change that.

> 2) I wrote the ResultSet to a file by using OutputStreamWriter(file, "EUC_CN"). The
> file is not readable from the browser with any charset setting.

That is a problem in whatever client interface that is (Java?) or your
browser.

> 3) Changing client_encoding from UNICODE to EUC_CN actually alter/loose the data if
> you compare the above "select *" statements.

You're going to have to be a bit more specific, because many of us can't
identify the characters or see what is wrong with them.

Also, try a more recent PostgreSQL version, such as 7.3.4.

--
Peter Eisentraut   peter_e@gmx.net
Thanks for your reply !!

I am using 7.3.1.

3) To be more specific about the data change/loss after conversion:

Input data in unicode, with client_encoding set to UNICODE:

logging-threshold=\u65e5\u5fd7\u9608\u503c
polling_setting_error=\u8bbe\u7f6e\u8f6e\u8be2\u95f4\u9694\u65f6\u51fa\u9519

Retrieved data, with client_encoding changed to EUC_CN:

logging-threshold=\uFFFD\uFFFD\u05BE\uFFFD\uFFFD\u05B5
polling_setting_error=\uFFFD\uFFFD\uFFFD\uFFFD\uFFFD\uFFFD\u046F\uFFFD\uFFFD\uFFFD\u02B1\uFFFD\uFFFD

thanks,
kathy

> Date: Wed, 10 Sep 2003 00:57:30 +0200 (CEST)
> From: Peter Eisentraut <peter_e@gmx.net>
> To: Kathy Zhu <Kathy.Zhu@sun.com>
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] encoding again
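
The retrieved logging-threshold value above has a recognisable shape: it is what you get when the four characters are encoded as EUC_CN and those bytes are then decoded as UTF-8. That would suggest the server-side conversion worked and the bytes were misread on the client side, although this is an inference and not something confirmed in the thread. A minimal Java sketch of the mismatch follows. It does not touch PostgreSQL or JDBC at all; the class and method names are made up for illustration, and "GB2312" is used as the Java charset name for EUC_CN.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of any PostgreSQL or JDBC API.
final class EncodingMismatchDemo {

    // Java's GB2312 charset is the EUC-CN encoding of the GB 2312 character set.
    static final Charset EUC_CN = Charset.forName("GB2312");

    // Render each UTF-16 code unit as a backslash-u escape with four uppercase
    // hex digits, the same notation used for the data quoted in the thread.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            sb.append(String.format("\\u%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    // Assumed failure mode: the bytes on the wire are EUC_CN, but the reader
    // decodes them as UTF-8. Malformed sequences become U+FFFD.
    static String misdecode(String text) {
        byte[] wire = text.getBytes(EUC_CN);
        return new String(wire, StandardCharsets.UTF_8);
    }

    // Correct handling: decode the EUC_CN bytes with the EUC_CN charset.
    static String roundTrip(String text) {
        byte[] wire = text.getBytes(EUC_CN);
        return new String(wire, EUC_CN);
    }

    public static void main(String[] args) {
        // The logging-threshold value from the thread.
        String original = "\u65e5\u5fd7\u9608\u503c";
        System.out.println("misdecoded: " + escape(misdecode(original)));
        System.out.println("round trip: " + escape(roundTrip(original)));
    }
}
```

On a current JDK the misdecoded line comes out as \uFFFD\uFFFD\u05BE\uFFFD\uFFFD\u05B5, the same sequence reported for logging-threshold, while the round trip returns the original four characters. If that is what is happening here, the thing to check is which charset the client uses when it reads the result bytes after client_encoding is switched to EUC_CN.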