On Saturday 2005-09-10 11:43, Peter Eisentraut wrote:
> Marc G. Fournier wrote:
> > Are there any data types that can hold pretty much any type of
> > character? UTF-16 isn't supported (or it's missing from the docs), and
> > UTF-8 doesn't appear to have a big enough range ...
>
> UTF-8 has exactly the same "range" as UTF-16. In any case, the UTF-8
> encoding in PostgreSQL is probably your best choice, unless you want to
> dig into the weirdness that is MULE_INTERNAL.
The 8.1 beta documentation says that UTF-8 in earlier versions of Pg only
covered the first 16 bits of Unicode, i.e. the Basic Multilingual Plane.
Unicode is often described as a 32-bit character set, but (according to my
copy of Unicode Demystified) it needs 21 bits to represent all the code
points available in Unicode 3.x, since the code space runs up to U+10FFFF.
(I think Unicode is now in 4.x.)
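This means that the code space supported by Pg 8.0 is technically too small.
It shouldn't matter in practice though, unless you are working with the rarer
Chinese characters that sit above the first 16 bits, or with a private
character set that lives outside that range.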
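
To make the arithmetic above concrete, here is a rough sketch in Python (not
anything from Pg itself). The helper names and the sample characters are my
own choices for illustration; the only facts relied on are that the Unicode
code space ends at U+10FFFF and that 16 bits cover U+0000 through U+FFFF.

```python
# Illustration only: which code points fit in 16 bits, and how many
# bytes each one takes in UTF-8. Helper names are hypothetical.

MAX_CODE_POINT = 0x10FFFF  # last code point in the Unicode code space


def bits_needed(code_point):
    """Number of bits needed to hold this code point as an integer."""
    return code_point.bit_length()


def fits_in_16_bits(code_point):
    """True if the code point lies in the first 16 bits (U+0000..U+FFFF)."""
    return code_point <= 0xFFFF


def utf8_length(code_point):
    """Number of bytes Python's UTF-8 encoder emits for this code point."""
    return len(chr(code_point).encode("utf-8"))


# Sample code points, picked by me as examples.
SAMPLES = [
    ("Latin capital A", 0x0041),
    ("common CJK ideograph", 0x4E2D),
    ("CJK Extension B ideograph", 0x20000),
    ("Plane 15 private use", 0xF0000),
]

if __name__ == "__main__":
    print("bits for U+10FFFF:", bits_needed(MAX_CODE_POINT))
    for name, cp in SAMPLES:
        print(f"U+{cp:04X} {name}: "
              f"fits in 16 bits = {fits_in_16_bits(cp)}, "
              f"UTF-8 bytes = {utf8_length(cp)}")
```

The everyday Chinese character still fits in 16 bits and takes three bytes in
UTF-8; only the code points above U+FFFF need the four-byte form.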