Re: [PATCHES] UNICODE characters above 0x10000 - Mailing list pgsql-hackers

From	Tatsuo Ishii
Subject	Re: [PATCHES] UNICODE characters above 0x10000
Date	August 7, 2004 07:44:30
Msg-id	20040807.194616.116347505.t-ishii@sra.co.jp
In response to	Re: [PATCHES] UNICODE characters above 0x10000 ("John Hansen" <john@geeknet.com.au>)
Responses	Re: [PATCHES] UNICODE characters above 0x10000
List	pgsql-hackers

> Yes, but the specification allows for 6byte sequences, or 32bit
> characters.

UTF-8 is just an encoding specification, not a character set
specification. The Unicode specification itself defines only 17
planes of 256x256 code points each.
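
To make the distinction concrete, here is a minimal Python sketch of
the arithmetic. The 17-plane figure comes from the paragraph above; the
byte-length thresholds are those of the original six-byte UTF-8 scheme
(RFC 2279), and the function name is only illustrative.

```python
# Size of the Unicode code space: 17 planes of 256x256 code points.
PLANES = 17
CODE_POINTS_PER_PLANE = 256 * 256

total_code_points = PLANES * CODE_POINTS_PER_PLANE
max_code_point = total_code_points - 1  # highest code point, 0x10FFFF

# Largest value representable with 1..6 bytes in the original UTF-8 scheme.
ORIGINAL_UTF8_LIMITS = [0x7F, 0x7FF, 0xFFFF, 0x1FFFFF, 0x3FFFFFF, 0x7FFFFFFF]


def utf8_length_original_scheme(cp: int) -> int:
    """Return how many bytes the six-byte UTF-8 scheme needs for cp."""
    if cp < 0:
        raise ValueError("code point must be non-negative")
    for length, limit in enumerate(ORIGINAL_UTF8_LIMITS, start=1):
        if cp <= limit:
            return length
    raise ValueError("value does not fit in a 6-byte sequence")


if __name__ == "__main__":
    print(hex(max_code_point), max_code_point.bit_length())
    print(utf8_length_original_scheme(max_code_point))
```

The encoding could carry 31-bit values in six bytes, but every code
point in the 17 planes fits in 21 bits and at most four bytes.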

> As dennis pointed out, just because they're not used, doesn't mean we
> should not allow them to be stored, since there might be someone using
> the high ranges for a private character set, which could very well be
> included in the specification some day.

Then we should expand it to 64-bit, since the specification might be
changed some day :-)

More seriously, Unicode is full of confusion and inconsistency,
IMO. Remember that Unicode advocates once said the merit of Unicode
was that it required only 16-bit characters. Now they say they need
surrogate pairs and 32-bit characters...

Anyway, my point is: if the current Unicode specification only allows
a 24-bit range, why do we need to allow usage that goes against the
specification?
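
As an illustration of what rejecting out-of-specification input could
look like, here is a hedged Python sketch. It is not PostgreSQL's
actual validation code; `decode_one` and `within_unicode` are
hypothetical names, and the decoder deliberately accepts the old 5- and
6-byte lead bytes so that the range check is what rejects them. It does
not check for overlong forms or surrogates.

```python
MAX_CODE_POINT = 0x10FFFF  # last code point of the 17th plane


def decode_one(seq: bytes) -> int:
    """Decode a single UTF-8 sequence using the six-byte bit layout."""
    first = seq[0]
    if first < 0x80:
        length, cp = 1, first
    elif first >> 5 == 0b110:
        length, cp = 2, first & 0x1F
    elif first >> 4 == 0b1110:
        length, cp = 3, first & 0x0F
    elif first >> 3 == 0b11110:
        length, cp = 4, first & 0x07
    elif first >> 2 == 0b111110:
        length, cp = 5, first & 0x03
    elif first >> 1 == 0b1111110:
        length, cp = 6, first & 0x01
    else:
        raise ValueError("invalid lead byte")
    if len(seq) != length:
        raise ValueError("wrong sequence length")
    for b in seq[1:]:
        if b >> 6 != 0b10:
            raise ValueError("invalid continuation byte")
        cp = (cp << 6) | (b & 0x3F)  # append 6 payload bits
    return cp


def within_unicode(seq: bytes) -> bool:
    """True if the decoded value lies inside the 17 Unicode planes."""
    return decode_one(seq) <= MAX_CODE_POINT


if __name__ == "__main__":
    print(within_unicode("\U0001D11E".encode("utf-8")))
    print(within_unicode(bytes([0xFD, 0xBF, 0xBF, 0xBF, 0xBF, 0xBF])))
```

A character above 0x10000 such as U+1D11E passes this check, while a
six-byte sequence decodes to a value far outside the 17 planes and is
rejected.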
--
Tatsuo Ishii
