Re: [HACKERS] Unicode combining characters - Mailing list pgsql-patches

From Tatsuo Ishii
Subject Re: [HACKERS] Unicode combining characters
Date
Msg-id 20011009231656N.t-ishii@sra.co.jp
In response to Re: [HACKERS] Unicode combining characters  (Patrice Hédé <phede-ml@islande.org>)
Responses Re: [HACKERS] Unicode combining characters  (Patrice Hédé <phede-ml@islande.org>)
List pgsql-patches
> - corrects a bit the UTF-8 code from Tatsuo to allow Unicode 3.1
>   characters (characters with values >= 0x10000, which are encoded on
>   four bytes).

After applying your patches, are 4-byte UTF-8 sequences converted to
UCS-2 (2 bytes) or UCS-4 (4 bytes) in pg_utf2wchar_with_len()? If it
is 4 bytes, we are in trouble: the current regex implementation does
not handle 4-byte-wide character sets.
--
Tatsuo Ishii
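
To illustrate the concern above, here is a minimal sketch of decoding one
UTF-8 sequence into a 32-bit code point and checking whether the result
still fits in 16 bits. It is not the code of pg_utf2wchar_with_len();
the names utf8_decode_one() and fits_in_ucs2() are hypothetical, and
overlong-form and surrogate checks are left out for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not PostgreSQL code): decode a single UTF-8
 * sequence of 1 to 4 bytes starting at s into *cp.
 * Returns the number of bytes consumed, or 0 if the input is truncated
 * or has a bad lead/continuation byte. Overlong forms are NOT rejected.
 */
static size_t
utf8_decode_one(const unsigned char *s, size_t len, uint32_t *cp)
{
    size_t   need;
    size_t   i;
    uint32_t c;

    if (len == 0)
        return 0;

    if (s[0] < 0x80)                 /* 0xxxxxxx: ASCII */
    {
        *cp = s[0];
        return 1;
    }
    else if ((s[0] & 0xE0) == 0xC0)  /* 110xxxxx: 2-byte sequence */
    {
        need = 2;
        c = s[0] & 0x1F;
    }
    else if ((s[0] & 0xF0) == 0xE0)  /* 1110xxxx: 3-byte sequence */
    {
        need = 3;
        c = s[0] & 0x0F;
    }
    else if ((s[0] & 0xF8) == 0xF0)  /* 11110xxx: 4-byte sequence */
    {
        need = 4;
        c = s[0] & 0x07;
    }
    else
        return 0;                    /* stray continuation or invalid lead byte */

    if (len < need)
        return 0;                    /* truncated input */

    for (i = 1; i < need; i++)
    {
        if ((s[i] & 0xC0) != 0x80)   /* continuation bytes must be 10xxxxxx */
            return 0;
        c = (c << 6) | (uint32_t) (s[i] & 0x3F);
    }

    *cp = c;
    return need;
}

/* A code point fits in a single 16-bit UCS-2 unit only if it is <= 0xFFFF. */
static int
fits_in_ucs2(uint32_t cp)
{
    return cp <= 0xFFFF;
}
```

Every 4-byte sequence decodes to a value of 0x10000 or above, so
fits_in_ucs2() is false for all of them. Code that assumes 2-byte wide
characters would need either a wider character type or some explicit
policy for such input.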
