Re: Second byte of multibyte characters causing trouble - Mailing list pgsql-general
From | Tatsuo Ishii |
---|---|
Subject | Re: Second byte of multibyte characters causing trouble |
Date | |
Msg-id | 20010922194800Y.t-ishii@sra.co.jp |
In response to | Re: Second byte of multibyte characters causing trouble ("Karen Ellrick" <k-ellrick@sctech.co.jp>) |
Responses | Re: Second byte of multibyte characters causing trouble |
List | pgsql-general |
> Now first I have to convert my existing data, which although sitting in a
> database that expects EUC, is actually SJIS-based text.  I found the
> following series of bash commands in a Japanese mailing list archive - does
> it look like this will work for me?  (It looks scary to just drop the whole
> database and hope that the .out file knows how to rebuild it with all the
> indexes, sequences, users, etc. in place - should I be nervous?)
> $ pg_dump -D dbname > db.out
> $ dropdb dbname
> $ createdb -E EUC_JP dbname
> $ export PGCLIENTENCODING=SJIS
> $ psql dbname < db.out
> $ export PGCLIENTENCODING=EUC_JP

Yes, the above procedure should convert your database, which holds
SJIS-based data by mistake, into an EUC_JP database.

> Regarding the user interface end, when I read the suggested solution of
> using jcode to convert everything in and out of the database, I thought,
> "That's tedious!  Why not just use EUC on the web pages, and the whole
> system will be in sync?"  But that seems to be almost as tedious.  The
> Windows-based editor I normally use to input the Japanese text portions of
> my code (I do most of the work in vi on my Linux box, but I can't input the
> Japanese that way)

You can't input Japanese using vi? Why?

> reads and writes in Shift-JIS unless I use pre- and
> post-processing filters, and it seems that other Windows programs also favor
> Shift-JIS.

Why not emacs? It can read and write SJIS texts directly.

> I did a totally unofficial, very-small-data-sample survey of
> Japanese web sites, and it seems that in general, sites that deal with
> ordinary consumers (and likely are written on Microsoft machines) use
> Shift-JIS (even ones that I figure must use databases, like search engines
> and e-commerce), Linux-related sites use JIS, and PostgreSQL-related sites
> use EUC.  I'm sure there's a grand story to explain how it got to be this
> messy, but for right now, I guess we have to live with all these different
> systems - apparently there is not one system that works nicely for all
> things, or else the others would gradually become obsolete, right?
>
> Before I add jcode function calls for every piece of data I get in or out of
> the database, or convert all my web page text to EUC-JP (I haven't decided
> yet which approach is more work, or more of a problem to maintain), are
> there any other thoughts on this?  For example, does someone know of one of
> the following: (a) a way to get the text-only console of a RedHat 6.1J box
> to actually display Japanese characters (if so, I not only wouldn't have to
> deal with the Windows box for editing, I could even read the output of
> queries in psql!),

Use the "kon" command.

> or (b) a text editor for Windows that can be configured
> to default to EUC, rather than having to remember to always select a filter
> to convert to and from Shift-JIS?

Again, why not emacs?

> Or on the flip side of the discussion,
> can anyone imagine pitfalls associated with having a web site that is half
> EUC (the PHP and Perl files that deal with the database) and half Shift-JIS
> (the static HTML pages that are written by other people in who-knows-what
> Windows-based tools)?

Are you using PHP? Then I strongly recommend upgrading to PHP 4.0.6 or
higher. It supports Japanese very well: it automatically guesses the
input charset and does the necessary conversion. This is very helpful.
Also, I recommend that you always use EUC-JP to write PHP scripts.

Assuming you can read and write Japanese, I recommend you subscribe to
the PHP-users list (http://ns1.php.gr.jp/mailman/listinfo/php-users).
--
Tatsuo Ishii
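For readers wondering why SJIS text stored in an EUC_JP database misbehaves on the second byte, here is a minimal sketch using Python's built-in codecs. It is not part of the thread: the sample character and the variable names are assumptions chosen for illustration. It shows that the second byte of a Shift-JIS character can fall in the ASCII range (for the sample character it is 0x5C, the backslash), while the same character in EUC-JP has the high bit set on both bytes, and that decoding as Shift-JIS and re-encoding as EUC-JP is the same kind of conversion the dump-and-reload procedure relies on.

```python
# Illustrative sketch only; the sample character and all names here are
# assumptions, not something taken from the original thread.

sample = "表"  # a character whose Shift-JIS second byte is 0x5C (backslash)

sjis_bytes = sample.encode("shift_jis")
euc_bytes = sample.encode("euc_jp")

# In Shift-JIS the trailing byte may collide with ASCII characters.
sjis_has_backslash = 0x5C in sjis_bytes

# In EUC-JP both bytes of this character have the high bit set,
# so no byte can be mistaken for an ASCII character.
euc_all_high = all(b >= 0x80 for b in euc_bytes)

# Converting mislabelled data: decode with the encoding the bytes are
# really in (Shift-JIS), then encode in the one the database expects.
converted = sjis_bytes.decode("shift_jis").encode("euc_jp")

print("Shift-JIS bytes:", sjis_bytes.hex(), "contains 0x5C:", sjis_has_backslash)
print("EUC-JP bytes:   ", euc_bytes.hex(), "all bytes >= 0x80:", euc_all_high)
print("Round trip matches EUC-JP encoding:", converted == euc_bytes)
```

The same decode-then-encode step could be applied to a text file before loading it, though for a whole database the dump and reload with PGCLIENTENCODING shown in the quoted commands lets the server do the conversion.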