On 2/16/23 4:35 AM, Robert Haas wrote:
> On Thu, Feb 16, 2023 at 1:01 AM Jeff Davis <pgsql@j-davis.com> wrote:
>> It feels very wrong to me to explain that sort order is defined by the
>> operating system on which Postgres happens to run. Saying that it's
>> defined by ICU, which is part of the Unicode Consortium, is much better.
>> It doesn't eliminate versioning issues, of course, but I think it's a
>> better explanation for users.
>
> The fact that we can't use ICU on Windows, though, weakens this
> argument a lot. In my experience, we have a lot of Windows users, and
> they're not any happier with the operating system collations than
> Linux users. Possibly less so.

This is one reason why we're discussing ICU as the "preferred default"
rather than "the default." While it may not completely eliminate
platform-dependent behavior for collations, it is a step forward.

And AIUI, it does sound like ICU is available on newer versions of
Windows[1].

> I feel like this is a very difficult kind of change to judge. If
> everyone else feels this is a win, we should go with it, and hopefully
> we'll end up better off. I do feel like there are things that could go
> wrong, though, between the imperfect documentation, the fact that a
> substantial chunk of our users won't be able to use it because they
> run Windows, and everybody having to adjust to the behavior change.

We should continue to improve our documentation. Personally, I found the
biggest challenge was understanding how to set ICU locales / rules,
particularly for nondeterministic collations, because it was hard to
find where these were listed. I was able to overcome this with the
examples in our docs + blogs, but I agree it's an area we can continue
to improve upon.
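
For illustration, here is the kind of definition that took some digging
to find: a case-insensitive, nondeterministic ICU collation, along the
lines of the example in the PostgreSQL docs. Treat it as a sketch: the
collation name is arbitrary, and it assumes a server built with ICU
support (PostgreSQL 12 or later for deterministic = false).

```sql
-- Sketch only: "case_insensitive" is an arbitrary name chosen for this example.
-- 'und-u-ks-level2' asks ICU for the root locale with comparison strength
-- level 2, which ignores case differences.
CREATE COLLATION case_insensitive (
    provider = icu,
    locale = 'und-u-ks-level2',
    deterministic = false
);

-- With the collation above, these two strings should compare as equal.
SELECT 'abc' = 'ABC' COLLATE case_insensitive;
```

The hard part is knowing that a locale string like 'und-u-ks-level2'
exists and what its keywords mean, which is where better docs would help.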

Thanks,

Jonathan

[1] https://learn.microsoft.com/en-us/dotnet/core/extensions/globalization-icu