Re: main log encoding problem - Mailing list pgsql-bugs

From Alexander Law
Subject Re: main log encoding problem
Date
Msg-id 5007C399.6000405@gmail.com
In response to Re: main log encoding problem  (Tatsuo Ishii <ishii@postgresql.org>)
Responses Re: main log encoding problem
List pgsql-bugs
>> And regarding mule internal encoding - reading about Mule
>> http://www.emacswiki.org/emacs/UnicodeEncoding I found:
>> /In future (probably Emacs 22), Mule will use an internal encoding
>> which is a UTF-8 encoding of a superset of Unicode. /
>> So I still see UTF-8 as a common denominator for all the encodings.
>> I am not aware of any characters absent in Unicode. Can you please
>> provide some examples of these that can result in lossy conversion?
> You can google by "encoding "EUC_JP" has no equivalent in "UTF8"" or
> some such to find such an example. In this case PostgreSQL just throws
> an error. For frontend/backend encoding conversion this is fine. But
> what should we do for logs? Apparently we cannot throw an error here.
>
> "Unification" is another problem. Some kanji characters of CJK are
> "unified" in Unicode. The idea of unification is: if kanji A in China,
> B in Japan, and C in Korea look "similar", unify A, B and C to D. This
> is a great space saving:-) The price of this is the inability to do
> round-trip conversion. You can convert A, B or C to D, but you cannot
> convert D to A/B/C.
>
> BTW, I'm not stuck on mule-internal encoding. What we need here is a
> "super" encoding which could include any existing encoding without
> information loss. For this purpose, I think we can even invent a new
> encoding (maybe something like the very first proposal of ISO/IEC
> 10646?). However, using UTF-8 for this purpose seems to be just a
> disaster to me.
>
OK, maybe the time for a truly universal encoding has not yet come. Then
maybe we should just add a new parameter "log_encoding" (UTF-8 by default)
to postgresql.conf and use this encoding consistently within
logging_collector.
If this encoding is not available, then fall back to 7-bit ASCII.
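
To make the idea concrete, here is a rough sketch in Python (not PostgreSQL
code). "log_encoding" is only the parameter proposed above and does not
exist in PostgreSQL; the function names are made up for illustration.
Escaping characters that the target encoding cannot represent, instead of
raising an error, is my own assumption about how the collector could
avoid failing while writing the log.

```python
import codecs

# Assumption: 7-bit ASCII is the last-resort encoding, as proposed above.
FALLBACK_ENCODING = "ascii"


def resolve_log_encoding(requested: str = "UTF-8") -> str:
    """Return the codec name to use for the log, or ASCII if unavailable."""
    try:
        return codecs.lookup(requested).name
    except LookupError:
        # The requested encoding is not known on this system.
        return FALLBACK_ENCODING


def encode_log_line(message: str, log_encoding: str = "UTF-8") -> bytes:
    """Encode one log message without ever raising an encoding error."""
    encoding = resolve_log_encoding(log_encoding)
    # Characters the target encoding cannot represent are written as
    # backslash escapes, so the information is kept in a readable form.
    return message.encode(encoding, errors="backslashreplace")
```

With an unknown encoding name the message is written as plain ASCII with
escapes, so a log line is always produced, whatever the source encoding
of the message was.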

