Re: Collation version tracking for macOS - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Collation version tracking for macOS
Date
Msg-id CA+hUKG+j7BmjJ9uLELnHh2D_-nzds5yfZWbKpEw_BkiUw3pzKQ@mail.gmail.com
In response to Re: Collation version tracking for macOS  (Jeremy Schneider <schneider@ardentperf.com>)
List pgsql-hackers
On Wed, Jun 15, 2022 at 7:10 AM Jeremy Schneider
<schneider@ardentperf.com> wrote:
> > On Jun 14, 2022, at 14:10, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:
> > Conversely, why are we looking at the ICU version instead of the
> > collation version.  If we have recorded the collation as being
> > version 1234, we need to look through the available ICU versions
> > (assuming we can load multiple ones somehow) and pick the one that
> > provides 1234.  It doesn't matter whether it's the same ICU version
> > that the collation was originally created with, as long as the
> > collation version stays the same.

One difference would be the effect if ICU ever ships a minor library
version update that changes the reported collversion.

1.  With the code I proposed in my v4 patch, our version mismatch
warnings would kick in, but otherwise everything would continue to
work (and corrupt indexes, if they really moved anything around).
2.  With a system that (somehow) opens all available libraries and
looks for a match, it would fail to find one.  That is assuming that you
are using the typical major-versioned packages we can see in software
distributions like Debian.
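
To make the difference between 1 and 2 concrete, here is a minimal
sketch in C.  It is not code from the v4 patch: the IcuCandidate struct,
both function names and any version strings you feed it are invented
for illustration, and no real ICU library is loaded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one installed ICU library. */
typedef struct
{
    int         major;          /* e.g. 68 */
    int         minor;          /* e.g. 2 */
    const char *collversion;    /* what this library reports for the collation */
} IcuCandidate;

/*
 * Approach 1: choose the library by recorded major version only.  The
 * caller then compares collversion itself and warns on a mismatch, but
 * still gets a library to work with.
 */
const IcuCandidate *
find_by_major(const IcuCandidate *libs, size_t n, int major)
{
    assert(libs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        if (libs[i].major == major)
            return &libs[i];
    }
    return NULL;
}

/*
 * Approach 2: search every available library for one that reports the
 * recorded collversion.  Returns NULL if no library provides it.
 */
const IcuCandidate *
find_by_collversion(const IcuCandidate *libs, size_t n, const char *recorded)
{
    assert(libs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        if (strcmp(libs[i].collversion, recorded) == 0)
            return &libs[i];
    }
    return NULL;
}
```

If a minor update changed what major 68 reports, find_by_major() would
still return the 68 library (with a collversion that no longer matches
the recorded one), while find_by_collversion() would return NULL.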

I don't know if minor version changes actually do that, though I have
wondered out loud a few times in these threads.  I might go and poke
at some ancient packages to see if that's happened before.  To defend
against that, we could instead do major + minor versioning, but so far
I have worried about major only because that's the way they ship 'em in
Debian and (AFAICS) RHEL etc, so you can't easily install 68.0 and
68.1 at the same time.  On the other hand, you could always "pin" (or
similar concepts) the libicu68 package to a specific minor release, to
fix the problem (whether you failed like 1 or like 2 above).

> (Common mistake I’ve seen folks make when comparing OS glibc versions
> is only looking at locale data, not realizing there have been changes
> to root behavior that didn’t involve any changes to locale data files)

Yeah, I've wondered idly before if libc projects and ICU couldn't just
offer a way to ask for versions explicitly, and ship historical data.
With some system of symlinks to make it all work with defaults for
those who don't care, a libc could have
/usr/share/locale/en_US@CLDR34.UTF-8 etc so you could
setlocale(LC_COLLATE, "en_US@CLDR34"), or something.  I suppose they
don't want to promise to be able to interpret the old data in future
releases, and, as you say, sometimes the changes are in C code, due to
bugs or algorithm changes, not the data.
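
As a sketch of what the caller's side of that might look like, assuming
a libc that did ship such versioned locales (none that I know of does,
and the function name here is made up): ask for the versioned name and
treat failure as "this collation version is not available", rather than
silently falling back to the default.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>

/*
 * Try to switch LC_COLLATE to an explicitly versioned locale name such
 * as the hypothetical "en_US@CLDR34".  Returns 1 on success and 0 if
 * the libc does not know the name; on failure setlocale() leaves the
 * current locale unchanged.
 */
int
try_versioned_collation(const char *versioned_name)
{
    assert(versioned_name != NULL);
    return setlocale(LC_COLLATE, versioned_name) != NULL;
}
```

Whether an unknown name is actually rejected depends on the libc, so
this only shows the shape of the call, not a guarantee of detection.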


