On Tue, Jun 7, 2022 at 1:24 PM Jeremy Schneider
<schneider@ardentperf.com> wrote:
> This idea does seem to persist. It's not as frequent as timezones, but
> collation rules reflect local dialects and customs, and there are
> changes quite regularly for a variety of reasons. A brief perusal of
> CLDR changelogs and CLDR jiras can give some insight here:
>
> Another misunderstanding that seems to persist is that this only relates
> to exotic locales or that it's only the 2.28 version.

I'm not defending the status quo, and I think that I'm better informed
than most about the problems in this area. My point was that it hardly
matters that we don't necessarily see outright corruption. This was
based in part on a misunderstanding of Tom's point, though.

> While the quality of glibc collations isn't great when compared with
> CLDR, I think the glibc maintainers have done versioning exactly right:
> they are clear about which patches are allowed to contain collation
> updates, and the OS distributions are able to ensure stability within a
> major OS release. I haven't yet found a Red Hat minor release that changed
> glibc collation.

That might be true, but my impression from interacting with Carlos
O'Donnell is that they pretty much don't take the concern about
stability all that seriously. Which I think is reasonable, given his
position!

The fact that we are this sensitive to glibc collation versioning might
be a wholly unique situation (unlike with ICU, which was built with
that in mind). It might be that every other user of glibc collations
sees this as fairly inconsequential, because they don't have to deal
with persistent state that directly relies on the rules in various
ways that are critically important. Even if glibc theoretically does a
perfect job of versioning, I still think that their priorities are
very much unlike our priorities, and that that should be a relevant
consideration for us.
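
To make the point about persistent state concrete, here is a minimal
Python sketch (not PostgreSQL code). The key functions old_key and
new_key are hypothetical stand-ins for a collation before and after a
rule change; they are not real glibc or ICU rules. A sorted list plays
the role of an index built under the old rules, and a binary search
plays the role of an index lookup.

```python
import bisect

# Hypothetical "old" rule: hyphens are ignored when comparing.
def old_key(s):
    return s.lower().replace("-", "")

# Hypothetical "new" rule: hyphens are significant.
def new_key(s):
    return s.lower()

# The "index": values stored in the order the old rules produced.
index = sorted(["ad", "a-c", "ab"], key=old_key)  # ['ab', 'a-c', 'ad']

def lookup(stored, value, key):
    """Binary search that assumes `stored` is ordered according to `key`."""
    keys = [key(v) for v in stored]
    pos = bisect.bisect_left(keys, key(value))
    return pos < len(keys) and keys[pos] == key(value)

found_old = lookup(index, "a-c", old_key)  # search with the rules used to build
found_new = lookup(index, "a-c", new_key)  # search after the rules changed
print(found_old, found_new)
```

The stored values are untouched, yet the search under the new rules
misses an entry that is present. A program that sorts in memory and
discards the result would never notice the rule change; anything that
persists the ordering does.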
--
Peter Geoghegan