Re: Collation version tracking for macOS - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Collation version tracking for macOS
Msg-id CA+TgmoaTGBSLWYHA_TPz9Fc1TOa5aZ=2yXMrGbo9xU225Nv+YA@mail.gmail.com
In response to Re: Collation version tracking for macOS  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Wed, Nov 23, 2022 at 12:09 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> OK.  Time for a new list of the various models we've discussed so far:
>
> 1.  search-by-collversion:  We introduce no new "library version"
> concept to COLLATION and DATABASE object and little or no new syntax.
>
> 2.  lib-version-in-providers: We introduce a separate provider value
> for each ICU version, for example ICU63, plus an unversioned ICU like
> today.
>
> 3.  lib-version-in-attributes: We introduce daticuversion (alongside
> datcollversion) and collicuversion (alongside collversion).  Similar
> to the above, but it's a separate property and the provider is always
> ICU.  New syntax for CREATE/ALTER COLLATION/DATABASE to set and change
> ICU_VERSION.
>
> 4.  lib-version-in-locale:  "63:en" from earlier versions.  That was
> mostly a strawman proposal to avoid getting bogged down in
> syntax/catalogue/model change discussions while trying to prove that
> dlopen would even work.  It doesn't sound like anyone really likes
> this.
>
> 5.  lib-version-in-collversion:  We didn't explicitly discuss this
> before, but you hinted at it: we could just use u_getVersion() in
> [dat]collversion.

I'd like to vote against #3 at least in the form that's described
here. If we had three more libraries providing collations, it's likely
that they would need versioning, too. So if we add an explicit notion
of provider version, then it ought not to be specific to libicu.

I think it's OK to decide that different library versions are
different providers (your option #2), or that they are the same
provider but give rise to different collations (your option #4), or
that there can be multiple versions of each collation which are
distinguished by some additional provider version field (your #3 made
more generic).

I don't really understand #1 or #5 well enough to have an educated
opinion, but I do think that #1 seems a bit magical. It hopes that the
combination of a collation name and a datcollversion will be
sufficient to find exactly one matching collation in a list of provided
libraries. The advantage of that, as I understand it, is that if you
do something to your system that causes the number of matches to go
from one to zero, you can just throw another library on the pile and
get the number back up to one. Woohoo! But there's a part of me that
worries: what if the number goes up to two, and they're not all the
same? Probably that's something that shouldn't happen, but if it does
then I think there's kind of no way to fix it. With the other options,
if there's some way to jigger the catalog state to match what you want
to happen, you can always repair the situation somehow, because the
library to be used for each collation is explicitly specified in some
way, and you just have to get it to match what you want to have
happen.
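
To make the zero/one/two-match concern concrete, here is a minimal sketch of the two lookup styles. It is a toy model, not PostgreSQL code: the Library class, the library names, and the version strings are all hypothetical, and real lookup would go through the catalogs and dlopen'd ICU libraries.

```python
from dataclasses import dataclass, field


@dataclass
class Library:
    # Hypothetical stand-in for one installed collation library.
    name: str                                      # e.g. "icu-63" (made-up label)
    versions: dict = field(default_factory=dict)   # collation name -> reported collversion


class AmbiguousMatch(Exception):
    """More than one library reports the recorded collversion."""


def search_by_collversion(libraries, collation, wanted_version):
    """Option #1 style: scan all libraries for the recorded version."""
    matches = [lib for lib in libraries
               if lib.versions.get(collation) == wanted_version]
    if not matches:
        return None  # zero matches: repairable by adding another library
    if len(matches) > 1:
        # Two or more matches: nothing in the catalog says which one is meant.
        raise AmbiguousMatch([lib.name for lib in matches])
    return matches[0]


def explicit_lookup(libraries, pinned_name):
    """Options #2-#4 style: the catalog names the library to use."""
    for lib in libraries:
        if lib.name == pinned_name:
            return lib
    return None


# Illustrative data only; the version strings are invented.
libs = [
    Library("icu-63", {"en": "v-old"}),
    Library("icu-67", {"en": "v-new"}),
]
```

With the search style, appending a third library that also reports "v-old" for "en" turns a working lookup into an AmbiguousMatch, and the only remedy is to remove a library. With the explicit style, the same situation is resolved by changing which name the catalog pins.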

I don't know too much about this, though, so I might have it all wrong.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


