Torsten Zühlsdorff <foo@meisterderspiele.de> writes:
>>> # SET lc_time = "de_DE.UTF-8";
>>> # SELECT to_char('2011-03-04 00:00:01'::date, 'TMMonth YYYY');
>>> to_char
>>> -----------
>>> MäRz 2011
>> I can reproduce the above when the database encoding is not UTF8 or
>> lc_ctype isn't a UTF8 locale.
> Hm... encoding of the database is UTF8. The lc_ctype is 'C'.
Right, that was the same case I checked. In C locale, ä is not a
letter, so you get the above from the initcap transformation.
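
To make the mechanism concrete, here is a rough Python sketch of an
initcap-style pass in which only ASCII letters and digits count as word
characters, which is roughly how the C locale classifies them. The
function names are hypothetical and this is not the actual Postgres
code, only an illustration of why a non-ASCII character in the middle
of a month name starts a new "word":

```python
def is_word_char_c_locale(ch: str) -> bool:
    # Assumption: models the C locale, where only ASCII letters and
    # digits are classified as alphanumeric.
    return ch.isascii() and ch.isalnum()


def initcap_c_locale(text: str) -> str:
    # Uppercase the first character of each word, lowercase the rest.
    # Anything that is not a word character ends the current word.
    out = []
    in_word = False
    for ch in text:
        if is_word_char_c_locale(ch):
            out.append(ch.lower() if in_word else ch.upper())
            in_word = True
        else:
            out.append(ch)  # passed through unchanged
            in_word = False
    return "".join(out)


print(initcap_c_locale("märz 2011"))
```

Under these assumptions the ä is treated as a word break, so the "r"
that follows it gets capitalized as if it began a new word.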
> But doesn't that mean that the translation of the timestamp to languages
> with other umlauts should also be wrong? For example to "fr_FR.UTF-8"?
Possibly, I haven't checked. If they have any month names with
non-ASCII characters in the middle, they'd see the same thing.
You would certainly also get undesirable results from TMMONTH, since
it wouldn't know how to uppercase ä. In my view none of this is
a Postgres bug --- the correct fix is to use locale settings that
correspond to the behavior you want.
regards, tom lane