Re: [pgsql-packagers] Palle Girgensohn's ICU patch - Mailing list pgsql-hackers

From Palle Girgensohn
Subject Re: [pgsql-packagers] Palle Girgensohn's ICU patch
Msg-id 15C9D821-9D55-4E14-8854-FA769BC7DDA6@pingpong.net
In response to Re: [pgsql-packagers] Palle Girgensohn's ICU patch  (Greg Stark <stark@mit.edu>)
List pgsql-hackers
> 26 nov 2014 kl. 15:21 skrev Greg Stark <stark@mit.edu>:
>
> I find it hard to believe the original premise of this thread. We knew
> there were some problems with OSX and FreeBSD but surely they can't be
> completely broken? What happens if you run "ls" with your locale set
> to something like fr_FR.UTF8 ? Does Apple not sell Macs in countries
> other than the US?

Hi,

On Mac OS X, ls -l is completely broken wrt UTF-8 collation. Really. Horribly broken. The sorting it produces for the
Swedish locale is simply nonexistent: completely unacceptable, unusable. Think of sorting Z just after S or something,
just to get a sense of how bad it is.

Application languages like Java have their own sorting. C-based stuff like Perl has its own way of doing it. Python,
well, it depends on the version; I haven't checked. C applications, well, it depends on whether they use ICU or not, I guess. :)

Apple sells computers, but does not really promote using locales in Terminal.app... :)

>
> There were a number of problems with using ICU including the large
> dependency and the limitations of the iterator model but the main
> issue was that it's fundamentally a choice between being consistent
> with every other application on your system and being consistent with
> other Postgres databases running on other OSes. Most people run
> multiple applications on one OS, not many databases on many OSes on
> their own with no other applications. If Postgres used ICU then its
> output would be inconsistent with things like "sort" or "ls" or your
> application programming language's comparison operators.

I think most people don't care about getting PostgreSQL collation consistent with sort or ls; they just want it to work
properly for real-life applications, so that users who really don't care about ls or sort get the result they expect. Or
they give up and sort in the application instead (= fail). But I guess that depends on which applications you use.
We've used the patch for 8+ years. For us, Linux's built-in collation would not have been enough either -- if memory
serves, it fails to sort 'ß' together with 'ss', and also fails to give upper('ß') => 'SS', which would be expected in the
real world.
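
To make the 'ß' expectation concrete, here is a minimal sketch. It is plain Python, not PostgreSQL and not the ICU patch; Python's Unicode case mapping is used only as a stand-in for the behaviour described above, and the word list is made up for illustration.

```python
# Illustration only: Python's Unicode string methods, not PostgreSQL or ICU.
words = ["Straße", "Strast", "Strasse", "Strand"]  # hypothetical sample data

# Full Unicode case mapping expands 'ß' to 'SS' when uppercasing.
upper_eszett = "ß".upper()

# Case folding maps 'ß' to 'ss', so both spellings share one sort key.
by_folded = sorted(words, key=str.casefold)

# Plain code point order pushes 'Straße' past every 'Stras...' word,
# because 'ß' (U+00DF) compares greater than 's'.
by_codepoint = sorted(words)

print(upper_eszett)
print(by_folded)
print(by_codepoint)
```

Case folding is only a rough approximation of real collation: it groups 'ß' with 'ss' but knows nothing about language-specific rules such as the Swedish alphabet.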




