Re: Collation version tracking for macOS - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Collation version tracking for macOS
Date
Msg-id CA+hUKGLv2kK4KnAhtFPDpdjjCS6vcdcBq9Gk02cRJb85zqjuSw@mail.gmail.com
Whole thread Raw
In response to Re: Collation version tracking for macOS  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
On Tue, Feb 13, 2024 at 9:25 AM Jeff Davis <pgsql@j-davis.com> wrote:
> On Sun, 2024-02-11 at 22:04 +0530, Robert Haas wrote:
> > "icu_multilib must be loaded via shared_preload_libraries.
> > icu_multilib ignores any ICU library with a major version greater
> > than
> > that with which PostgreSQL was built."
> >
> > It's not clear from reading this whether the second sentence here is
> > a
> > regrettable implementation restriction or design behavior. If it's
> > design behavior, what's the point of it?
>
> That restriction came from Thomas's (uncommitted) work on the same
> problem. I believe the reasoning was that we don't know whether future
> versions of ICU might break something that we're doing, though perhaps
> there's a better way.

Right, to spell that out more fully:  We compile and link against one
particular ICU library that is present at compile time, and there is a
place in that multi-lib patch that assigns the function pointers from
that version to variables of the function pointer type that we expect.
Compilation would fail if ICU ever changed relevant function
prototypes in a future release, and then we'd have to come up with
some trampoline/wrapper scheme to wallpaper over differences.  That's
why I think it's safe to use dlsym() to access function pointers for
versions up to and including the one whose headers we were compiled
against, but not later ones which we haven't tested in that way.

Sadly I won't be able to work on multi-lib ICU support again in this
cycle.  I think we managed to prove that dlopen works for this, and
learn some really interesting stuff about Unicode and ICU evolution,
but we still have to come up with the right model, catalogues and DDL
etc, for a nice user experience.  What I was most recently
experimenting with based on earlier discussions was the idea of
declaring separate providers: icu72 and icu68 could both exist and you
could create extra indexes and then drop the originals as a
no-downtime upgrade path.  I have a pet theory that you could usefully
support multi-version libc locales too if you're prepared to make
certain assumptions (short version: take the collation definition
files from any older version of your OS, compile with newer version's
localedef, give it a name like "en_US@ubuntu18", and assume/pray they
didn't change stuff that wasn't expressed in the definition file), so
I was working on a generalisation slightly wider than just
multi-version ICU.



pgsql-hackers by date:

Previous
From: Melanie Plageman
Date:
Subject: BitmapHeapScan streaming read user and prelim refactoring
Next
From: Tatsuo Ishii
Date:
Subject: Re: When extended query protocol ends?