Re: Distinct oddity - Mailing list pgsql-sql

From Tom Lane
Subject Re: Distinct oddity
Msg-id 368.1242229676@sss.pgh.pa.us
In response to Re: Distinct oddity  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-sql
I wrote:
> Maximilian Tyrtania <maximilian.tyrtania@onlinehome.de> writes:
>> On 12.05.2009 at 19:23, Alvaro Herrera
>> (alvherre@commandprompt.com) wrote:
>>> What platform are you using anyway?

>> Mac OS 10.4.11

> I have some vague recollection that UTF8-using locales don't actually
> work well on OSX ... check the archives ...

OK, the thread (or one of the threads) I was remembering is here:
http://archives.postgresql.org//pgsql-general/2005-11/msg00047.php

I am too lazy to boot up 10.4 right now, but looking on a 10.5.6 machine
indicates that Apple is still being pretty lame about this:

$ ls -l /usr/share/locale/de_DE 
total 40
lrwxr-xr-x  1 root  wheel   28 Feb 27  2008 LC_COLLATE -> ../la_LN.US-ASCII/LC_COLLATE
lrwxr-xr-x  1 root  wheel   17 Feb 27  2008 LC_CTYPE -> ../UTF-8/LC_CTYPE
drwxr-xr-x  3 root  wheel  102 Feb 27  2008 LC_MESSAGES
lrwxr-xr-x  1 root  wheel   30 Feb 27  2008 LC_MONETARY -> ../de_DE.ISO8859-1/LC_MONETARY
lrwxr-xr-x  1 root  wheel   29 Feb 27  2008 LC_NUMERIC -> ../de_DE.ISO8859-1/LC_NUMERIC
-r--r--r--  1 root  wheel  370 Jan  2  2008 LC_TIME

So it looks like they understand UTF-8 to the extent of supporting
character classification fairly well, but sort order is "just ASCII".
I'm not sure exactly how that might result in the observed odd behavior
of DISTINCT, but I bet it's causing it somehow.  You'd probably have
better luck in the de_DE.ISO8859-1 or de_DE.ISO8859-15 locales.
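
For anyone wanting to check other locales the same way, here is a rough
sketch. The helper name collation_target is made up for illustration, and
it assumes the locale data sits in per-locale directories under
/usr/share/locale as in the listing above; other systems may lay this out
differently.

```shell
# Hypothetical helper: report what a locale directory's LC_COLLATE resolves to.
# Assumes the per-locale directory layout shown in the listing above.
collation_target() {
    f="$1/LC_COLLATE"
    if [ -L "$f" ]; then
        # A symlink: print the table this locale borrows its sort order from.
        readlink "$f"
    elif [ -e "$f" ]; then
        # A regular file: the locale ships its own collation table.
        echo "(own file)"
    else
        echo "(missing)"
    fi
}

collation_target /usr/share/locale/de_DE
```

If the output names an ASCII table, as it does for de_DE in the listing
above, strings in that locale are compared with plain ASCII ordering.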
        regards, tom lane

