joe speigle
hello,
I am interested in this issue, too, and have lightly investigated how postgres internationalizes its applications. I
have been asking for help with Korean. However, Korean already has all its problems solved. I had to ask on a Korean forum:
http://database.sarang.net/?inc=read&aid=5368&criteria=pgsql&subcrit=qna&id=&limit=20&keyword=&page=1
The locale interacts in some way with the internationalization code,
a point I haven't gotten to understand yet. The internationalization code is scattered throughout the source code.
The "hackers" et al. say that changing the way it works would mean a complete database rewrite. Meaning, changing the
encoding from database-wide and unchangeable to a property of a column datatype.
AFAIK you can set the encoding in clients (e.g. libpq) in the connect string; then, depending on what the encoding of
the database is set to, a conversion will be done. If you look in your library directory, there are all kinds of *.so
files which are used for such conversions. If yours doesn't exist, you should analyze those conversions. I think the
charsets are used, but you will have to provide yours from somewhere else. The conversion code is always
encoding-specific, depending on the ranges of values your language's atomic units take and the rules of the encoding.
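
To make the idea of a client/server encoding conversion concrete, here is a minimal sketch in Python. It does not call
PostgreSQL or libpq at all; it only mimics the kind of byte-level conversion described above, using Python's built-in
codecs. The helper name convert, the sample text, and the choice of UTF-8 as the "database" encoding and EUC_KR as
the "client" encoding are assumptions made for illustration.

```python
# Illustration only: this imitates an encoding conversion with Python's
# standard codecs. It is not PostgreSQL's conversion code.

def convert(data: bytes, src: str, dst: str) -> bytes:
    """Decode bytes from the source encoding and re-encode them in the target."""
    return data.decode(src).encode(dst)

text = "한글"  # sample Korean text (assumed example)

# Pretend the database stores the text as UTF-8 ...
server_bytes = text.encode("utf-8")

# ... and the client asked for EUC_KR, so a conversion happens on the way out.
client_bytes = convert(server_bytes, "utf-8", "euc_kr")

# The same characters occupy a different number of bytes in each encoding.
print(len(server_bytes), len(client_bytes))  # prints: 6 4

# Converting back yields the original stored bytes.
round_trip = convert(client_bytes, "euc_kr", "utf-8")
print(round_trip == server_bytes)  # prints: True
```

The point of the sketch is only that the stored bytes and the bytes the client sees can differ while the characters
stay the same, which is why each conversion has to know the rules of both encodings.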
I am unsure whether or not indexing is possible with database-wide encodings. I raised that question on the
Korean forum, but received no really good answer. If you have some time (so do I), I would like to write a small source
file which would extract the column information from the tuple to see in what encoding it is stored at which point,
to see if my above guess is right.
Can somebody tell me: if the data were stored in Unicode, and the client encoding set to UTF8 or UNICODE, would it be possible
to write a datatype as a C function, to allow comparisons and indexing of the character types, and has this been attempted?
In any case, I would like to write that test code module first.
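
As a small experiment on the comparison question, the following Python sketch checks one property that matters for
ordering and indexing: comparing UTF-8 encoded strings byte by byte gives the same order as comparing them by Unicode
code point. This says nothing about how PostgreSQL actually compares or indexes text, and a locale-aware collation may
order strings differently; the word list is an assumed sample.

```python
# Illustration only: byte-wise order of UTF-8 data versus code point order.
# This is not PostgreSQL code and does not model locale-aware collation.

words = ["나무", "가방", "다리", "abc"]  # assumed sample data

# Python compares str values by Unicode code point.
by_codepoint = sorted(words)

# Compare the raw UTF-8 bytes instead, as a plain byte comparison would.
by_utf8_bytes = sorted(words, key=lambda w: w.encode("utf-8"))

print(by_codepoint)                    # prints: ['abc', '가방', '나무', '다리']
print(by_codepoint == by_utf8_bytes)   # prints: True
```

If this property holds for the stored form, a comparison function over UTF-8 data could in principle be a plain byte
comparison, but whether that gives an order users consider correct depends on the locale.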
--
joe speigle
www.sirfsup.com