Re: UTF8 national character data type support WIP patch and list of open issues. - Mailing list pgsql-hackers

From MauMau
Subject Re: UTF8 national character data type support WIP patch and list of open issues.
Date
Msg-id 1191A5384BD641C68D288AF210BEFDA8@maumau
In response to Re: UTF8 national character data type support WIP patch and list of open issues.  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
From: "Tom Lane" <tgl@sss.pgh.pa.us>
> Another point to keep in mind is that UTF16 is not really any easier
> to deal with than UTF8, unless you write code that fails to support
> characters outside the basic multilingual plane.  Which is a restriction
> I don't believe we'd accept.  But without that restriction, you're still
> forced to deal with variable-width characters; and there's nothing very
> nice about the way that's done in UTF16.  So on the whole I think it
> makes more sense to use UTF8 for this.

I agree.  I guess the reason Windows, Java, and Oracle chose UTF-16 is that, 
when they made the choice, it was really UCS-2, covering only the BMP.  So 
character handling was easier and faster thanks to the fixed-width encoding.
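
To make the width point concrete, here is a small sketch (Python, only for illustration; the helper name code_units and the sample characters are my own choices, not anything from the patch).  It counts how many UTF-8 bytes and how many 16-bit UTF-16 code units one character needs.  Inside the BMP, UTF-16 is always one unit, but a character outside the BMP needs a surrogate pair, so the encoding is variable-width just like UTF-8.

```python
def code_units(ch):
    """Return (UTF-8 byte count, UTF-16 16-bit code unit count) for one character."""
    utf8_bytes = len(ch.encode("utf-8"))
    # "utf-16-le" avoids the byte order mark, so bytes // 2 is the unit count.
    utf16_units = len(ch.encode("utf-16-le")) // 2
    return utf8_bytes, utf16_units


# Illustrative samples: two BMP characters and one supplementary-plane character.
samples = [
    ("U+0041", "A"),            # LATIN CAPITAL LETTER A (BMP)
    ("U+3042", "\u3042"),       # HIRAGANA LETTER A (BMP)
    ("U+20BB7", "\U00020BB7"),  # CJK ideograph outside the BMP
]

for name, ch in samples:
    utf8_bytes, utf16_units = code_units(ch)
    print(f"{name}: UTF-8 bytes={utf8_bytes}, UTF-16 units={utf16_units}")
```

With UCS-2 the last sample simply could not be represented, which is why the fixed-width assumption held back then.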

Regards
MauMau



