Thread: a "catch all" type ... such a thing?
Are there any data types that can hold pretty much any type of character? UTF-16 isn't supported (or it's missing from the docs), and UTF-8 doesn't appear to have a big enough range ...
----
Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org  Yahoo!: yscrappy  ICQ: 7615664
Marc G. Fournier wrote:
> Are there any data types that can hold pretty much any type of
> character? UTF-16 isn't supported (or it's missing from the docs), and
> UTF-8 doesn't appear to have a big enough range ...
Please note: the type of character you can store is generally not a matter of _datatype_; it depends on the database's character set encoding.
On Sep 9, 2005, at 3:09 PM, Eugene E. wrote:
> Marc G. Fournier wrote:
>> Are there any data types that can hold pretty much any type of
>> character? UTF-16 isn't supported (or it's missing from the docs),
>> and UTF-8 doesn't appear to have a big enough range ...
>
> Please note: type of character is generally not a matter of _datatype_
That said, perhaps BYTEA would work. It's not exactly the same as a text string, though, as you could only use the BYTEA functions for data manipulation. The SQL_ASCII encoding is *very* accepting, but it has its own issues; a look in the archives will provide more info.
Michael Glaesemann
grzm myrealbox com
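To illustrate the BYTEA trade-off mentioned above, here is a minimal Python sketch. It is only an analogy, not PostgreSQL code: Python's `bytes` stands in for BYTEA and `str` for a text type, and the sample string is made up for the example. It shows that byte-level operations count and slice bytes rather than characters.

```python
# Analogy only: bytes plays the role of BYTEA, str the role of a text type.
text = "na\u00efve \u4e2d\u6587"      # hypothetical sample: 8 characters
raw = text.encode("utf-8")            # what a byte-string column would store

char_count = len(text)                # counts characters
byte_count = len(raw)                 # counts bytes, as a byte-level length would

# Slicing by bytes can cut a multi-byte character in half.
try:
    raw[:3].decode("utf-8")           # b"na\xc3" ends in the middle of a character
    split_ok = True
except UnicodeDecodeError:
    split_ok = False

print(char_count, byte_count, split_ok)
```

A byte-string column accepts anything, but the application then has to track the encoding and do character-aware work itself.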
Marc G. Fournier wrote:
> Are there any data types that can hold pretty much any type of
> character? UTF-16 isn't supported (or it's missing from the docs), and
> UTF-8 doesn't appear to have a big enough range ...
UTF-8 has exactly the same "range" as UTF-16. In any case, the UTF-8 encoding in PostgreSQL is probably your best choice, unless you want to dig into the weirdness that is MULE_INTERNAL.
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
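The point that UTF-8 and UTF-16 cover the same range can be checked with a short Python sketch. It uses only Python's built-in codecs; the sample code points and the helper name `roundtrips` are chosen for illustration and say nothing about what a particular PostgreSQL version accepts.

```python
def roundtrips(code_point: int) -> bool:
    """Return True if the code point survives encoding in both UTF-8 and UTF-16."""
    ch = chr(code_point)
    return (ch.encode("utf-8").decode("utf-8") == ch
            and ch.encode("utf-16-be").decode("utf-16-be") == ch)

# Illustrative samples: ASCII, Latin-1, a common CJK ideograph, the last
# 16-bit value, the first value beyond 16 bits, a plane-2 ideograph, and
# the highest Unicode code point.
samples = [0x41, 0xE9, 0x4E2D, 0xFFFF, 0x10000, 0x2000B, 0x10FFFF]
results = {hex(cp): roundtrips(cp) for cp in samples}

# Beyond 16 bits, both encodings need 4 bytes: a 4-byte sequence in UTF-8,
# a surrogate pair (2 x 2 bytes) in UTF-16.
utf8_len = len(chr(0x2000B).encode("utf-8"))
utf16_len = len(chr(0x2000B).encode("utf-16-be"))

print(results, utf8_len, utf16_len)
```

Both are encodings of the same Unicode code space, so neither can represent a character the other cannot.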
On Saturday 2005-09-10 11:43, Peter Eisentraut wrote:
> Marc G. Fournier wrote:
> > Are there any data types that can hold pretty much any type of
> > character? UTF-16 isn't supported (or it's missing from the docs), and
> > UTF-8 doesn't appear to have a big enough range ...
>
> UTF-8 has exactly the same "range" as UTF-16. In any case, the UTF-8
> encoding in PostgreSQL is probably your best choice, unless you want to
> dig into the weirdness that is MULE_INTERNAL.
The 8.1 beta documentation says that UTF-8 in earlier versions of Pg only covered the first 16 bits of Unicode. Unfortunately, "pure" Unicode (UTF-32) uses 32 bits per character and (according to my Unicode Demystified) needs at least 21 bits to represent all the code points available in Unicode 3.x. (I think Unicode is now at 4.x.) This means that the code space supported by Pg 8.0 is technically too small. It shouldn't matter, though, unless you are working with Chinese (some rarer ideographs lie beyond the first 16 bits) or a private character set.
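The bit counts above can be verified with a small Python sketch. The constants are the standard Unicode limits; the helper `fits_in_16_bits` is a hypothetical name introduced here, and the 16-bit restriction it models is taken from the description in this message, not from testing any PostgreSQL release.

```python
MAX_CODE_POINT = 0x10FFFF   # highest Unicode code point
MAX_16_BIT = 0xFFFF         # last code point representable in 16 bits

bits_for_all = MAX_CODE_POINT.bit_length()   # bits needed for the full code space
bits_for_16 = MAX_16_BIT.bit_length()

def utf8_length(code_point: int) -> int:
    """Number of bytes UTF-8 uses for one code point."""
    return len(chr(code_point).encode("utf-8"))

def fits_in_16_bits(text: str) -> bool:
    """Hypothetical check: would every character fit in a 16-bit code space?"""
    return all(ord(ch) <= MAX_16_BIT for ch in text)

common = "\u4e2d\u6587"     # two everyday Chinese characters
rare = "\U0002000B"         # an ideograph from plane 2, beyond 16 bits

print(bits_for_all, bits_for_16)
print(utf8_length(0xFFFF), utf8_length(0x10000))
print(fits_in_16_bits(common), fits_in_16_bits(rare))
```

A check along these lines could be run over existing data to see whether the 16-bit limit would matter in practice.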