Thread: character conversion problem about UTF-8-->SHIFT_JIS_2004

character conversion problem about UTF-8-->SHIFT_JIS_2004

From
"bh yuan"
Date:
hi

I have used PostgreSQL 7.4.3 with PHP for more than 3 years.
Now I want to move my database to PostgreSQL 8.3,
but I ran into the following problem:
----------------------------------------------------------
ERROR: character 0xe9ab99 of encoding "UTF8" has no equivalent in "SJIS"
ERROR: character 0xe9ab99 of encoding "UTF8" has no equivalent in "SHIFT_JIS_2004"
----------------------------------------------------------
The database is encoded in UTF-8.
To export data as a .csv file,
I use  set client_encoding='SJIS'  on the client.
With PostgreSQL 7.4.3 no problem occurred,
but after I changed to PostgreSQL 8.3 the error appeared.

Can I ignore the error message?
Or is there any other method to solve this problem?

Thanks

Re: character conversion problem about UTF-8-->SHIFT_JIS_2004

From
"Hiroshi Saito"
Date:
Hi.

----- Original Message -----
From: "bh yuan" <bhyuan@gmail.com>

> hi
>
> I have used PostgreSQL 7.4.3 with PHP for more than 3 years.
> Now I want to move my database to PostgreSQL 8.3,
> but I ran into the following problem:
> ----------------------------------------------------------
> ERROR: character 0xe9ab99 of encoding "UTF8" has no equivalent in "SJIS"
> ERROR: character 0xe9ab99 of encoding "UTF8" has no equivalent in "SHIFT_JIS_2004"
> ----------------------------------------------------------
> The database is encoded in UTF-8.

SERVER_ENCODING=UTF-8 is OK.

> To export data as a .csv file,
> I use  set client_encoding='SJIS'  on the client.

No, you should use UTF-8, the default.

> With PostgreSQL 7.4.3 no problem occurred,

It seems that 7.4.3 had a looser check....

> but after I changed to PostgreSQL 8.3 the error appeared.
>
> Can I ignore the error message?
> Or is there any other method to solve this problem?

The "0xe9ab99" you are using is a well-known Unicode character.
So the error is correct. (It is not SJIS.)

Regards,
Hiroshi Saito

Re: character conversion problem about UTF-8-->SHIFT_JIS_2004

From
"Hiroshi Saito"
Date:
Oops, I did not give enough information, sorry.
Please see:
http://winpg.jp/~saito/pg83/HASHIGODATA/

Re: character conversion problem about UTF-8-->SHIFT_JIS_2004

From
"bh yuan"
Date:
Thanks for your reply.

"0xe9ab99" is neither SJIS nor SHIFT_JIS_2004.
But I have to move data that contains non-standard SJIS characters
from the old database (7.4.3) to the new database (8.3),
and keep using the old program, which exports the data as an SJIS-encoded .csv file.
Can I modify the conf file to ignore the error?
Or can I find the offending characters in the database and convert them to
regular SJIS characters with some tool?

And
> >> to export data as .csv file,
> >> I use  set client_encoding='SJIS' at client.
> >
> > No, you should use UTF-8 of default.
Does this mean I should export a UTF-8 encoded csv file?

Thanks
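
On the question of finding which characters cannot be converted, here is a minimal sketch in Python. It is not a PostgreSQL feature. It assumes the text has already been fetched from the database as UTF-8, and it uses Python's own shift_jis codec as a stand-in, which is not guaranteed to match PostgreSQL's SJIS conversion table character for character. The function name find_unencodable is made up for this example.

```python
def find_unencodable(text, encoding="shift_jis"):
    """Return (character, UTF-8 bytes as hex) for every character that
    cannot be encoded in `encoding`, in order of first appearance."""
    seen = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            # Report the character the same way the PostgreSQL error does:
            # as the hex value of its UTF-8 byte sequence.
            item = (ch, "0x" + ch.encode("utf-8").hex())
            if item not in seen:
                seen.append(item)
    return seen


if __name__ == "__main__":
    # Hypothetical sample value: a no-break space and U+9AD9 mixed with
    # ordinary text.
    sample = "price\u00a0100 \u9ad9\u6a4b"
    for ch, hexcode in find_unencodable(sample):
        print(f"U+{ord(ch):04X} {hexcode}")
```

Running this over each text column would give a list of candidates to replace before an SJIS export. The list should still be checked against the server, because the two conversion tables may differ.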

2008/2/8, Hiroshi Saito <z-saito@guitar.ocn.ne.jp>:
> Oops, I did not give enough information, sorry.
> Please see:
> http://winpg.jp/~saito/pg83/HASHIGODATA/

Re: character conversion problem about UTF-8-->SHIFT_JIS_2004

From
"Hiroshi Saito"
Date:
Hi.

----- Original Message -----
From: "bh yuan" <bhyuan@gmail.com>


> Thanks for your reply.
>
> "0xe9ab99" is neither SJIS nor SHIFT_JIS_2004.

Ahh OK, you already understood.  :-)

> But I have to move data that contains non-standard SJIS characters
> from the old database (7.4.3) to the new database (8.3),
> and keep using the old program, which exports the data as an SJIS-encoded .csv file.
> Can I modify the conf file to ignore the error?
> Or can I find the offending characters in the database and convert them to
> regular SJIS characters with some tool?

I do not know whether the contents of the .csv file you obtained
are really SJIS... However, as one method:
if you were running with a SERVER_ENCODING of UTF-8, you can
carry the data over without conversion. An environment variable will be helpful for that.

--
set PGCLIENTENCODING=UTF-8
pg_dump
--

Regards,
Hiroshi Saito

Re: character conversion problem about UTF-8-->SHIFT_JIS_2004

From
"Hiroshi Saito"
Date:
Oops again..

or
--
set PGCLIENTENCODING=SJIS
pg_dump
or
psql's \copy.
--
However, on a Unix-like shell the environment variable is set as follows instead:

export PGCLIENTENCODING=SJIS
:-)

Regards,
Hiroshi Saito

Re: character conversion problem about UTF-8-->SHIFT_JIS_2004

From
"bh yuan"
Date:
I think I can export the old db (7.4) with SJIS encoding,
but I cannot import it into the new db (8.3) because of some irregular SJIS characters.

Also,
I created the database with UTF-8 encoding:
---
createdb -E UNICODE db
---
I think this is the same as setting PGCLIENTENCODING=UTF-8,
and using set client_encoding='SJIS' is the same as export PGCLIENTENCODING=SJIS.

But I cannot export the data to a csv file correctly without errors,
just because of the characters that are not in SJIS.

The main problem is that I want to ignore the non-SJIS characters,
but I do not know how to do it ... :(
Something like the iconv function in PHP:
-----------------------------
If you append the string //TRANSLIT to out_charset transliteration is
activated. This means that when a character can't be represented in
the target charset, it can be approximated through one or several
similarly looking characters. If you append the string //IGNORE,
characters that cannot be represented in the target charset are
silently discarded. Otherwise, str is cut from the first illegal
character.
-----------------------------

Of course this is not the best way,
but I just want to resolve this problem immediately.


Thanks
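
For comparison, the "silently discard" behaviour described in the iconv text above can be sketched on the client side in Python. This illustrates the idea only. It is not a PostgreSQL setting, and it assumes the data has been fetched as UTF-8 with the SJIS conversion done by the exporting program. The helper name to_sjis is made up for this example. Python's codecs have no direct counterpart to //TRANSLIT; the "replace" handler only substitutes a question mark.

```python
def to_sjis(text, mode="strict"):
    """Encode text as Shift_JIS using Python's codec.

    mode="strict"  raises UnicodeEncodeError on an unmappable character
    mode="ignore"  drops the character (similar in spirit to iconv //IGNORE)
    mode="replace" writes "?" in its place
    """
    return text.encode("shift_jis", errors=mode)


# U+9AD9 (UTF-8 0xe9ab99) is not available in Python's shift_jis codec.
sample = "A\u9ad9B"
print(to_sjis(sample, "ignore"))    # b'AB'
print(to_sjis(sample, "replace"))   # b'A?B'
```

Dropping characters loses data, so this is a stopgap in the same sense as the iconv option.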

Re: character conversion problem about UTF-8-->SHIFT_JIS_2004

From
"Hiroshi Saito"
Date:
Hi.

Oh, OK.
First, about the data in the database.
From your earlier report, you are running with UTF-8. Therefore,
you should dump it as UTF-8 first, then reload it into the new 8.3 database, also UTF-8.
No conversion occurs in that case.

Next,
when the client requires SJIS, some character codes may have no entry in
the conversion table. That is a separate problem,
and it needs to be analyzed in detail. The database itself, however, can be operated
normally.

Regards,
Hiroshi Saito

Re: character conversion problem about UTF-8-->SHIFT_JIS_2004

From
"Hiroshi Saito"
Date:
Hi.

>> and using set client_encoding='SJIS' is the same as export PGCLIENTENCODING=SJIS.
>>
>> But I cannot export the data to a csv file correctly without errors,
>> just because of the characters that are not in SJIS.

Um, please show the information for your 7.4.3 database.

ex)
postgres=# \l
        List of databases
   Name    |  Owner   | Encoding
-----------+----------+----------
 postgres  | postgres | UTF8
 template0 | postgres | UTF8
 template1 | postgres | UTF8
(3 rows)

Perhaps your database is SQL_ASCII?

Regards,
Hiroshi Saito

Re: character conversion problem about UTF-8-->SHIFT_JIS_2004

From
"bh yuan"
Date:
Hi

Using \l, the 7.4.3 database shows:
        List of databases
   Name    |  Owner   | Encoding
-----------+----------+-----------
 testdbxx  | userxxx  | UNICODE
 template0 | postgres | SQL_ASCII
 template1 | postgres | SQL_ASCII
(3 rows)

I think [some character codes may not have a conversion table] is the reason.
So far I have found that 「～」(0xefbd9e)、「―」(0xe28095)、「髙」(0xe9ab99) cannot be
converted to SJIS without an error message.

I converted the character to another SJIS character
with UPDATE tablexx SET fieldxx=replace(fieldxx,'\xef\xbd\x9e','~')
and then I could export it as an SJIS CSV file.
But that is not a good idea. Maybe I can change a setting in a PostgreSQL 8.3 configuration file,
or change the conversion table myself?

Thanks
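
The replace() workaround above can also be sketched outside the database, as a substitution table applied before export. This is only an illustration in Python. The table contents are assumptions chosen for the example (the '~' follows the UPDATE statement above, the '-' is arbitrary), and the substitutions are lossy. The names SUBSTITUTIONS and substitute are made up here.

```python
# Hypothetical substitution table: Unicode code point -> replacement text.
# The two entries are the characters reported above by their UTF-8 bytes.
SUBSTITUTIONS = {
    0xFF5E: "~",   # FULLWIDTH TILDE, UTF-8 0xefbd9e
    0x2015: "-",   # HORIZONTAL BAR,  UTF-8 0xe28095
}


def substitute(text, table=SUBSTITUTIONS):
    """Replace each character listed in `table`; leave everything else as is."""
    return text.translate(table)


print(substitute("10\uff5e20"))   # 10~20
```

Keeping the table in one place makes it easy to review which characters are being altered, which an in-place UPDATE does not.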

Re: character conversion problem about UTF-8-->SHIFT_JIS_2004

From
Alvaro Herrera
Date:
bh yuan wrote:

> I think [some character codes may not have a conversion table] is the reason.
> So far I have found that 「～」(0xefbd9e)、「―」(0xe28095)、「髙」(0xe9ab99) cannot be
> converted to SJIS without an error message.
>
> I converted the character to another SJIS character
> with UPDATE tablexx SET fieldxx=replace(fieldxx,'\xef\xbd\x9e','~')
> and then I could export it as an SJIS CSV file.
> But that is not a good idea. Maybe I can change a setting in a PostgreSQL 8.3 configuration file,
> or change the conversion table myself?

I guess you can change the conversion table yourself -- see
src/backend/utils/mb/Unicode.  I think you would have to edit the
sjis-0213-2004-std.txt file to add those characters, then run
UCS_to_SHIFT_JIS_2004.pl to generate the updated .map file, then
regenerate the shared lib at
src/backend/utils/mb/conversion_procs/utf8_and_shift_jis_2004, and
reinstall it.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: character conversion problem about UTF-8-->SHIFT_JIS_2004

From
Tatsuo Ishii
Date:
> hi
>
> I have used PostgreSQL 7.4.3 with PHP for more than 3 years.
> Now I want to move my database to PostgreSQL 8.3,
> but I ran into the following problem:
> ----------------------------------------------------------
> ERROR: character 0xe9ab99 of encoding "UTF8" has no equivalent in "SJIS"
> ERROR: character 0xe9ab99 of encoding "UTF8" has no equivalent in "SHIFT_JIS_2004"
> ----------------------------------------------------------
> The database is encoded in UTF-8.
> To export data as a .csv file,
> I use  set client_encoding='SJIS'  on the client.
> With PostgreSQL 7.4.3 no problem occurred,
> but after I changed to PostgreSQL 8.3 the error appeared.
>
> Can I ignore the error message?
> Or is there any other method to solve this problem?

First of all, you should be aware that SHIFT_JIS_2004 is a completely
different beast from SJIS. If you want to continue to use the SJIS data from
7.4, you must use SJIS, not SHIFT_JIS_2004, on 8.3. Or do you have any
particular reason to use SHIFT_JIS_2004?

BTW,

> ERROR: character 0xe9ab99 of encoding "UTF8" has no equivalent in "SJIS"

I don't see this error message with PostgreSQL 8.3.0 running on a
Linux box. I can store UTF-8 0xe9ab99 (== U+9AD9) and retrieve it from
the SJIS client side (0xe9ab99 corresponds to 0xfbfc). Actually we can
confirm this by looking at line 6914 in
src/backend/utils/mb/Unicode/utf8_to_sjis.map:

  {0xe9ab99, 0xfbfc},

Note that the left side is the value for UTF-8, and the right side the
value for SJIS. I recommend that you double-check your PostgreSQL 8.3
installation.

For your convenience, I have attached a dump containing a table
(called "t1") which has the UTF-8 character in question.

$ createdb -E UTF_8 test
$ gunzip -c /tmp/t1.dump.gz|psql test
$ psql -c "set client_encoding to SJIS;select * from t1" test
--
Tatsuo Ishii
SRA OSS, Inc. Japan

Attachment
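
The byte values in the map entry above can be checked with a short Python sketch. It shows how the left-hand value 0xe9ab99 relates to U+9AD9. It then uses Python's own codecs as a rough stand-in: shift_jis for the plain JIS-based table and cp932 for the Microsoft variant with vendor extensions. These codecs are not PostgreSQL's conversion tables, so this is a cross-check of the character only, not of the server's behaviour. The helper names are made up for this example.

```python
def utf8_key(ch):
    """Return the UTF-8 bytes of one character as an integer, the form
    used on the left side of entries such as {0xe9ab99, 0xfbfc}."""
    return int.from_bytes(ch.encode("utf-8"), "big")


def can_encode(ch, encoding):
    """True if Python's codec for `encoding` can represent `ch`."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


ch = bytes.fromhex("e9ab99").decode("utf-8")
print(f"U+{ord(ch):04X}")            # U+9AD9
print(hex(utf8_key(ch)))             # 0xe9ab99
print(can_encode(ch, "shift_jis"))   # False
print(can_encode(ch, "cp932"))       # True
```

The difference between the last two lines shows that "SJIS" can mean different tables in different tools, which is why the same character may convert on one installation and fail on another.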

Re: character conversion problem about UTF-8-->SHIFT_JIS_2004

From
"bh yuan"
Date:
I know that SHIFT_JIS_2004 is different from SJIS.
But when I used SJIS I ran into the same problem,
so I tried SHIFT_JIS_2004.

=> set client_encoding='SJIS';
SET
=> select * from tablexx;
ERROR:  character 0xc2a0 of encoding "UTF8" has no equivalent in "SJIS"

I am confused...

Thanks


2008/2/13, Tatsuo Ishii <ishii@postgresql.org>:
> First of all, you should be aware that SHIFT_JIS_2004 is a completely
> different beast from SJIS. If you want to continue to use the SJIS data from
> 7.4, you must use SJIS, not SHIFT_JIS_2004, on 8.3. Or do you have any
> particular reason to use SHIFT_JIS_2004?

Re: character conversion problem about UTF-8-->SHIFT_JIS_2004

From
Tatsuo Ishii
Date:
I don't see anything strange.

There has been no mapping from UTF-8 0xc2a0 to SJIS in PostgreSQL
since day one. That means you should get the error on 7.4.3 as
well as on 8.3. Are you sure that you don't get the error on 7.4.3?
--
Tatsuo Ishii
SRA OSS, Inc. Japan

Re: character conversion problem about UTF-8-->SHIFT_JIS_2004

From
Tatsuo Ishii
Date:
> Thanks for your reply.
> I think there was no error in 7.4.3, only a warning.

That means the character in question was ignored in 7.4, i.e. the
character was skipped. I'm not sure that's actually what you want.
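
The difference between rejecting and skipping an unmappable character can be sketched as follows. Python's shift_jis codec and its errors="ignore" mode are used here only as an analogy; they are not what PostgreSQL 7.4 or 8.3 runs internally.

```python
# "A", NO-BREAK SPACE (UTF-8 0xc2a0), "B"
text = "A\u00a0B"

# Strict conversion: fails loudly, comparable to the ERROR seen on 8.3.
try:
    text.encode("shift_jis")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Lenient conversion: the unmappable character is silently dropped.
skipped = text.encode("shift_jis", errors="ignore")

print(strict_failed, skipped)
```

In the lenient case the exported data no longer matches what is stored, and nothing in the output shows that a character went missing.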

> I used the old PostgreSQL 7.4.3 for 3 years with a
> UTF-8 encoded web-based frontend.
> Without strict encoding checks, users could freely enter not only
> characters that exist in SJIS but also other UTF-8 characters.
> With 7.4.3, I could export the data as SJIS without error,
> for example:
> --
> set client_encoding='SJIS';
> select xxx from xxx ...
> --
> But after I upgraded the database from 7.4.3 to 8.3, I get these errors:
> --
> ERROR:  character 0xc2a0 of encoding "UTF8" has no equivalent in "SJIS"
> ERROR:  character 0xe29890 of encoding "UTF8" has no equivalent in "SJIS"
> ERROR:  character 0xe998b3 of encoding "UTF8" has no equivalent in "SJIS"
> --
> I tried to modify the conversion map utf8_to_sjis.map,
> but users have also entered some Chinese characters into the database,
> so I had to give up.
> Instead I patched
> /postgresql-8.3.0/src/backend/utils/mb/conv.c to avoid the problem.

You are risking an SQL injection attack with that modification.

> In function UtfToLocal, line 468:
>             /*old code
>             if (p == NULL)
>                 report_untranslatable_char(PG_UTF8, encoding,
>                                            (const char *) (utf - l), len);
>             code = p->code;
>             */
>             if (p == NULL) {
>                 //WARNING not ERROR
>                 ereport(WARNING,
>                         (errcode(ERRCODE_UNTRANSLATABLE_CHARACTER),
>                           errmsg("Ignoring : character 0x%s of encoding \"%s\" has no equivalent in \"%s\"",
>                                  utf,
>                                  pg_enc2name_tbl[PG_UTF8].name,
>                                  pg_enc2name_tbl[encoding].name)));
>                 continue;
> --
> I do not know whether this is right, even though I can compile and
> install it correctly.
> Can anybody help me check the file, or suggest any other idea?
>
> Three source files are attached:
> conv.8.3.c -- 8.3 original conv.c file
> conv.7.4.3.c -- 7.4.3 original conv.c file
> conv.c    -- modified file
>
> Thanks