Re: Collation version tracking for macOS - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Collation version tracking for macOS
Date
Msg-id CA+hUKGLAAd+ftrUxJUV=Grgmf7gk0oVd_msG3oty2ufv-HVx+g@mail.gmail.com
In response to Re: Collation version tracking for macOS  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
Responses Re: Collation version tracking for macOS
List pgsql-hackers
On Sat, Jun 4, 2022 at 12:17 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
> On 07.05.22 02:31, Thomas Munro wrote:
> > Last time I looked into this it seemed like macOS's strcoll() gave
> > sensible answers in the traditional single-byte encodings, but didn't
> > understand UTF-8 at all so you get C/strcmp() order.  In other words
> > there was effectively nothing to version.
>
> Someone recently told me that collations in macOS have actually changed
> recently and that this is a live problem.  See explanation here:
>
> https://github.com/PostgresApp/PostgresApp/blob/master/docs/documentation/reindex-warning.md?plain=1#L66

How can I see evidence of this?  I'm comparing Debian, FreeBSD and
macOS 12.4.  When I run "LC_COLLATE=en_US.UTF-8 sort
/usr/share/dict/words", I get upper and lower case mixed together on
the other OSes, but on the Mac the upper case comes first.  That is my
usual smoke test for "am I looking at binary sort order?"
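
For anyone who wants to repeat that smoke test without sort(1), here is
a rough sketch in Python.  The function names and the sample word list
are made up for illustration.  It assumes the en_US.UTF-8 locale is
installed, and it goes through Python's locale.strxfrm(), which uses the
C library's wide-character collation and so may not behave exactly like
sort(1) or strcoll() on every platform.

```python
import locale


def looks_like_binary_order(words):
    """Return True if the list is already in plain code-point order.

    For UTF-8 text, code-point order matches byte (strcmp-like) order,
    so upper case ASCII letters sort before lower case ones.
    """
    return list(words) == sorted(words)


def sort_with_locale(words, name="en_US.UTF-8"):
    """Sort words using the C library's collation for the named locale.

    Returns None if the locale cannot be selected on this system.
    """
    try:
        locale.setlocale(locale.LC_COLLATE, name)
    except locale.Error:
        return None
    return sorted(words, key=locale.strxfrm)


if __name__ == "__main__":
    # Hypothetical sample; /usr/share/dict/words would work as well.
    sample = ["apple", "Banana", "cherry", "Apple", "banana", "Cherry"]
    result = sort_with_locale(sample)
    if result is None:
        print("locale not available")
    elif looks_like_binary_order(result):
        print(result, "-> looks like binary order")
    else:
        print(result, "-> looks like linguistic order")
```

If the result comes back with all the capitalised words first, that is
the same signal as the sort(1) output on the Mac described above.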


