Re: client libpq multibyte support - Mailing list pgsql-hackers

From Tatsuo Ishii
Subject Re: client libpq multibyte support
Date
Msg-id 20000505171725E.t-ishii@sra.co.jp
In response to Re: client libpq multibyte support  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: client libpq multibyte support
List pgsql-hackers
> >  admin=# select * from SJIS_KANJI ;
> >  \: extra argument ';' ignored
> >  \: extra argument ';' ignored
> >  Invalid command \. Try \? for help.        
> 
> Ugh :-(.  We have not seen this reported before --- do you know exactly
> where it's coming from?  (I suspect it may be a psql issue not a libpq
> issue, but hard to say without more info.)

That's because a non-MB client does not understand that Shift JIS
text mixes characters of different byte widths. A similar problem
would also happen with the Big5 character set (traditional Chinese).
Unlike other character sets, these two must be treated carefully:
their multibyte characters include bytes with the same bit patterns
as ASCII characters, and that confuses non-MB clients.
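
To illustrate the point, here is a minimal sketch in Python (not psql's
actual code). It assumes a hypothetical table named with the single kanji
表, whose Shift JIS encoding is the byte pair 0x95 0x5C; the second byte is
the same as an ASCII backslash. The function names and the simplified
lead-byte ranges are assumptions made for this example only.

```python
# Sketch: why a byte-at-a-time scanner sees a "backslash command" inside
# Shift JIS text, while an encoding-aware scanner does not.

BACKSLASH = 0x5C


def naive_backslash_offsets(data: bytes) -> list[int]:
    """Treat every 0x5C byte as a backslash, as a non-MB client would."""
    return [i for i, b in enumerate(data) if b == BACKSLASH]


def is_sjis_lead_byte(b: int) -> bool:
    """Simplified Shift JIS lead-byte test (assumed ranges 0x81-0x9F, 0xE0-0xFC)."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def sjis_aware_backslash_offsets(data: bytes) -> list[int]:
    """Skip the trail byte of each double-byte character before testing for 0x5C."""
    offsets = []
    i = 0
    while i < len(data):
        if is_sjis_lead_byte(data[i]):
            i += 2  # lead byte plus its trail byte form one character
            continue
        if data[i] == BACKSLASH:
            offsets.append(i)
        i += 1
    return offsets


# "select * from 表 ;" with the kanji written out as its Shift JIS bytes.
query = b"select * from \x95\x5c ;"

print(naive_backslash_offsets(query))       # the trail byte is misread as a backslash
print(sjis_aware_backslash_offsets(query))  # no backslash found
```

The naive scan reports a backslash in the middle of the table name, which
is the kind of misparse that would produce the "\: extra argument" errors
quoted above; the encoding-aware scan reports none.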

> I do not think that will go over well with people who don't need
> multibyte support, since the MULTIBYTE code is a good deal larger
> and slower.  Also, AFAIK we didn't have any such problem in 6.5, so
> perhaps this is just a small bug not requiring such a sledgehammer
> solution.  We need to look more closely.

No, 6.5 (and earlier versions) have exactly the same "bug." The reason
you haven't heard about it until now is simply that nobody had tried a
mixed MB/non-MB backend/client configuration until Masaaki came up
with pgbash :-) Anyway, I can hardly imagine that such configurations
actually exist in the real world. Masaaki, could you tell me what the
advantages of, or reasons for, this configuration are?

As for Tom's comment that "the MULTIBYTE code is a good deal larger and
slower": IMHO that's the price of i18n (I don't claim my MB
implementation is the most efficient one, though). Today almost all
OSes and applications are evolving to be "i18n ready." Look at Lamar's
new RPM: multibyte and locale support are now enabled by default in
it.

In the near future, PostgreSQL will have true i18n functionality
(NATIONAL CHARACTER and friends), and I look forward to joining that
work. I hope PostgreSQL will be i18n ready by default by then.
--
Tatsuo Ishii
