Re: Unicode problems on IRC - Mailing list pgsql-hackers

From Oliver Jowett
Subject Re: Unicode problems on IRC
Msg-id 4259B98B.6030509@opencloud.com
In response to Re: Unicode problems on IRC  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote:

> Yeah?  Cool.  Does John's proposed patch do it "correctly"?
> 
> http://candle.pha.pa.us/mhonarc/patches2/msg00076.html

Some comments on that patch:

Doesn't pg_utf2wchar_with_len need changes for the longer sequences?

UtfToLocal also appears to need changes.

If we support sequences longer than 4 bytes, then UtfToLocal/LocalToUtf
and the associated translation tables need a redesign, as they currently
assume the sequence fits in an unsigned int. (IIRC, Unicode doesn't go
above U+10FFFF, which already fits in 4 bytes, but the original UTF-8
scheme can encode larger values using 5- and 6-byte sequences?)
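
To make the size concern concrete, here is a rough sketch in C. It is not
taken from the patch or from the PostgreSQL source: utf8_seq_len and
utf8_pack_key are hypothetical helpers, and the sketch assumes that "fits in
an unsigned int" means the raw bytes of one sequence are packed into a 32-bit
lookup key. Under that assumption a 4-byte sequence just fits, while the 5-
and 6-byte forms of the older UTF-8 definition do not.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: sequence length implied by a UTF-8 lead byte, using
 * the older layout that allows up to 6 bytes. Returns 0 for a byte that
 * cannot start a sequence (a continuation byte, 0xFE, or 0xFF).
 */
static int
utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;               /* 0xxxxxxx */
    if ((lead & 0xE0) == 0xC0)
        return 2;               /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0)
        return 3;               /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0)
        return 4;               /* 11110xxx */
    if ((lead & 0xFC) == 0xF8)
        return 5;               /* 111110xx */
    if ((lead & 0xFE) == 0xFC)
        return 6;               /* 1111110x */
    return 0;
}

/*
 * Hypothetical helper: pack the raw bytes of one sequence into a 32-bit key,
 * as a table keyed on a 32-bit unsigned value would require. Returns 0 on
 * success and -1 if the sequence is truncated, has a bad lead byte, or is
 * too long to fit. Continuation bytes are not validated here.
 */
static int
utf8_pack_key(const unsigned char *s, size_t avail, uint32_t *key)
{
    uint32_t    k = 0;
    int         len;
    int         i;

    if (avail == 0)
        return -1;
    len = utf8_seq_len(s[0]);
    if (len == 0 || (size_t) len > avail || len > (int) sizeof(uint32_t))
        return -1;
    for (i = 0; i < len; i++)
        k = (k << 8) | s[i];
    *key = k;
    return 0;
}
```

With these helpers, the bytes F4 8F BF BF (U+10FFFF) pack into a single
32-bit key, whereas a 5-byte sequence starting with F8 is rejected, which is
the case where a wider key or a different table layout would be needed.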

-O

