Re: again: Bug #943: Server-Encoding from - Mailing list pgsql-hackers

From Enke, Michael
Subject Re: again: Bug #943: Server-Encoding from
Date
Msg-id 3F0D5D0D.7E83D8C8@wincor-nixdorf.com
In response to Re: again: Bug #943: Server-Encoding from EUC_TW to UTF-8  (Tatsuo Ishii <t-ishii@sra.co.jp>)
List pgsql-hackers
Tatsuo Ishii wrote:
> 
> > I have looked into my Linux box and found this in /usr/share/i18n/charmaps/BIG5.gz:
> > % Chinese charmap for BIG5 (CP950)
> > % version: 0.92
> > % Contact: Tung-Han Hsieh   <thhsieh@linux.org.tw>
> > %          Yuan-Chung Cheng <platin@ms31.hinet.net>
> > % Distribution and use is free, even for comercial purpose.
> > %
> > % This charmap is converted from:
> > %     ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP950.TXT
> > % ...
> >
> > "My" characters are in there.
> 
> That's a M$'s definition, not a standard. I think there should be a
> reason why the Unicode org. does not use it.

OK, I do not know the reason. But since glibc also uses it, couldn't you use it too?
I believe the glibc developers have thought about this a lot and came to the
conclusion that this definition should be used. Why not PostgreSQL?

> > Don't you agree that it is strange that I can (for EUC_TW) copy "to" file without error
> > but I can not copy "from" file without error?
> 
> I'm not quite sure what you are saying. Are you complaining that (for
> example) 0xe7a281 in UTF-8 does not convert to EUC_TW?

Yes, exactly, since this value comes from a "copy to" with PGCLIENTENCODING=EUC_TW.
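
To make it clear which character is meant, here is a small Python sketch (my own
illustration, not anything from the PostgreSQL sources) that decodes the UTF-8
bytes 0xe7a281 quoted above and prints the resulting Unicode code point. The
variable names are just placeholders.

```python
# Decode the UTF-8 byte sequence from the mail and report its code point.
raw = bytes.fromhex("e7a281")      # the three bytes 0xE7 0xA2 0x81
char = raw.decode("utf-8")         # a single CJK character
codepoint = f"U+{ord(char):04X}"   # format as a Unicode code point
print(codepoint)                   # prints U+7881
```

Whether this code point has a counterpart in EUC_TW depends on the conversion
table in use, which is exactly the point under discussion.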

> 
> BTW, what do you think about the below?
> 
> FYI, CNS 11643-1993 is the standard character set and EUC_TW is
> one of its encodings. That means your problem below will disappear.

Ok.

Regards,
Michael

> > > > > WARNING:  copy: line 2, LocalToUtf: could not convert (0x8ea3cfd0) EUC_TW to UTF-8. Ignored
> > > > > WARNING:  copy: line 3, LocalToUtf: could not convert (0x8ea3c4ce) EUC_TW to UTF-8. Ignored
> > > > > WARNING:  copy: line 4, LocalToUtf: could not convert (0x8ea3bdfe) EUC_TW to UTF-8. Ignored
> 
> > > > > Hum. These seem to be CNS 11643-1993, plane 3. Currently PostgreSQL
> > > > > supports only:
> > > > >
> > > > > CNS 11643-1993, plane 0
> > > > > CNS 11643-1993, plane 1
> > > > > CNS 11643-1993, plane 2
> > > > > CNS 11643-1993, plane 15
> > > > >
> > > > > Would you like to have support for rest of CNS 11643-1993 planes:
> > > > >
> > > > > CNS 11643-1993, plane 3
> > > > > CNS 11643-1993, plane 4
> > > > > CNS 11643-1993, plane 5
> > > > > CNS 11643-1993, plane 6
> > > > > CNS 11643-1993, plane 7
> > > > >
> > > > > support for upcoming 7.4?
> --
> Tatsuo Ishii
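
As a rough check of the "plane 3" reading of the warnings above, the following
Python sketch extracts the plane number from a 4-byte EUC_TW value. It assumes
the usual EUC-TW layout for 4-byte sequences: a single-shift byte 0x8E, then a
plane byte in 0xA1..0xB0 (planes 1 to 16), then two character bytes. The
function name euc_tw_plane is hypothetical and not part of PostgreSQL.

```python
SS2 = 0x8E  # single-shift 2: introduces a 4-byte EUC-TW sequence


def euc_tw_plane(code: int) -> int:
    """Return the CNS 11643 plane encoded in a 4-byte EUC-TW value."""
    b = code.to_bytes(4, "big")
    if b[0] != SS2:
        raise ValueError("not a 4-byte EUC-TW sequence")
    if not 0xA1 <= b[1] <= 0xB0:
        raise ValueError("plane byte out of range")
    return b[1] - 0xA0  # 0xA1 -> plane 1, 0xA3 -> plane 3, ...


# The three values reported in the COPY warnings.
planes = [euc_tw_plane(c) for c in (0x8EA3CFD0, 0x8EA3C4CE, 0x8EA3BDFE)]
print(planes)  # prints [3, 3, 3]
```

All three values carry the plane byte 0xA3, which matches the observation that
they belong to plane 3.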

