I wrote:
> Maximilian Tyrtania <maximilian.tyrtania@onlinehome.de> writes:
>> On 12.05.2009 at 19:23, Alvaro Herrera
>> (alvherre@commandprompt.com) wrote:
>>> What platform are you using anyway?
>> Mac OS 10.4.11
> I have some vague recollection that UTF8-using locales don't actually
> work well on OSX ... check the archives ...
OK, the thread (or one of the threads) I was remembering is here:
http://archives.postgresql.org//pgsql-general/2005-11/msg00047.php
I am too lazy to boot up 10.4 right now, but looking on a 10.5.6 machine
indicates that Apple is still being pretty lame about this:
$ ls -l /usr/share/locale/de_DE
total 40
lrwxr-xr-x 1 root wheel 28 Feb 27 2008 LC_COLLATE -> ../la_LN.US-ASCII/LC_COLLATE
lrwxr-xr-x 1 root wheel 17 Feb 27 2008 LC_CTYPE -> ../UTF-8/LC_CTYPE
drwxr-xr-x 3 root wheel 102 Feb 27 2008 LC_MESSAGES
lrwxr-xr-x 1 root wheel 30 Feb 27 2008 LC_MONETARY -> ../de_DE.ISO8859-1/LC_MONETARY
lrwxr-xr-x 1 root wheel 29 Feb 27 2008 LC_NUMERIC -> ../de_DE.ISO8859-1/LC_NUMERIC
-r--r--r-- 1 root wheel 370 Jan 2 2008 LC_TIME
So it looks like they understand UTF-8 to the extent of supporting
character classification fairly well, but sort order is "just ASCII".
I'm not sure exactly how that might result in the observed odd behavior
of DISTINCT, but I bet it's causing it somehow. You'd probably have
better luck in the de_DE.ISO8859-1 or de_DE.ISO8859-15 locales.
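If you want to check what a given locale's collation actually does on your
machine, a rough sketch in Python follows. It sorts a few sample words with the
locale's strcoll() and compares the result with plain code-point order; if the
two agree, the locale is probably not doing any language-aware sorting. The
helper name collation_looks_bytewise, the sample words, and the locale names
tried at the bottom are assumptions for illustration only, and agreement on a
small sample is a hint rather than proof.

```python
import functools
import locale


def collation_looks_bytewise(locale_name, words):
    """Compare locale-aware sorting with plain code-point sorting.

    Returns True if both orders agree for the given words, False if they
    differ, and None if the locale cannot be selected on this system.
    (Hypothetical helper, not part of PostgreSQL or of the original thread.)
    """
    # Calling setlocale with only a category queries the current setting.
    saved = locale.setlocale(locale.LC_COLLATE)
    try:
        try:
            locale.setlocale(locale.LC_COLLATE, locale_name)
        except locale.Error:
            return None  # locale not installed or not supported
        by_locale = sorted(words, key=functools.cmp_to_key(locale.strcoll))
        by_codepoint = sorted(words)  # Python compares str by code point
        return by_locale == by_codepoint
    finally:
        # Restore the previous collation setting for the rest of the process.
        locale.setlocale(locale.LC_COLLATE, saved)


if __name__ == "__main__":
    # Sample data chosen so that a German-aware collation should not put
    # "Zebra" ahead of "apfel" or push the umlaut words to the very end.
    sample = ["Zebra", "apfel", "Ärger", "Öl", "offen"]
    for name in ("C", "de_DE.UTF-8", "de_DE.ISO8859-1"):
        print(name, collation_looks_bytewise(name, sample))
```

A result of True for a de_DE locale would be consistent with the LC_COLLATE
symlink shown above; None just means the locale name is not available there.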
regards, tom lane