Re: Collation versioning - Mailing list pgsql-hackers
From | Thomas Munro |
---|---|
Subject | Re: Collation versioning |
Date | |
Msg-id | CAEepm=04PvEdmRmCCcn4c7ydDA=-G=uLe5vDdfJiqp58Jpi8Kw@mail.gmail.com |
In response to | Re: Collation versioning (Douglas Doole <dougdoole@gmail.com>) |
Responses |
Re: Collation versioning
Re: Collation versioning |
List | pgsql-hackers |
On Tue, Sep 25, 2018 at 4:26 AM Douglas Doole <dougdoole@gmail.com> wrote:
> On Sun, Sep 23, 2018 at 2:48 PM Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>> Admittedly that creates a whole can
>> of worms for initdb-time catalog creation, package maintainers' jobs,
>> how long old versions have to be supported and how you upgraded
>> database objects to new ICU versions.
>
> Yep. We never came up with a good answer for that before I left IBM. At the time, DB2 only supported 2 or 3 versions of ICU, so they were all shipped as part of the install bundle.
>
> Long term, I think the only viable approach to supporting multiple versions of ICU is runtime loading of the libraries. Then it's up to the system administrator to make sure the necessary versions are installed on the system.

I wonder if we would be practically constrained to using the distro-supplied ICU (by their policies of not allowing packages to ship their own copies of ICU); it seems like it. I wonder which distros allow multiple versions of ICU to be installed. I see that Debian 9.5 only has 57 in the default repo, but the major version is in the package name (what is the proper term for that kind of versioning?) and it doesn't declare a conflict with other versions, so that's promising. Poking around with nm I noticed also that both the RHEL and Debian ICU libraries have explicitly versioned symbol names like "ucol_strcollUTF8_57", which is also promising. FreeBSD seems to have used "--disable-renaming" and therefore defines only "ucol_strcollUTF8"; doh. This topic is discussed here:

http://userguide.icu-project.org/design#TOC-ICU-Binary-Compatibility:-Using-ICU-as-an-Operating-System-Level-Library

Personally I'm not planning to work on multi-version installation any time soon, I was just scoping out some basic facts about all this. I think the primary problem that affects most of our users is the shifting-under-your-feet problem, which we now see applies equally to libc and libicu.
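
To illustrate the versioned-symbol observation above, here is a rough Python sketch of how a runtime loader might look up a function in a specific ICU major version. It is not anything PostgreSQL does. The helper names (icu_library_name, icu_symbol_name, load_icu_function) are hypothetical, and the "libicui18n.so.<major>" file name assumes Linux-style shared-library naming.

```python
import ctypes


def icu_library_name(major: int) -> str:
    """File name of the ICU i18n library for one major version.

    Assumption: Linux-style naming such as libicui18n.so.57.
    """
    return f"libicui18n.so.{major}"


def icu_symbol_name(base: str, major: int, renamed: bool = True) -> str:
    """Name under which the library exports `base`.

    With ICU's default symbol renaming the major version is appended
    (ucol_strcollUTF8_57). A build configured with --disable-renaming
    exports the bare name instead (ucol_strcollUTF8).
    """
    return f"{base}_{major}" if renamed else base


def load_icu_function(base: str, major: int):
    """Try to load `base` from the requested ICU major version.

    Returns a ctypes function object, or None when that library version
    (or the symbol) is not available on this system.
    """
    try:
        lib = ctypes.CDLL(icu_library_name(major))
    except OSError:
        return None  # that major version is not installed
    # Try the versioned name first, then fall back to the bare name.
    for renamed in (True, False):
        try:
            return getattr(lib, icu_symbol_name(base, major, renamed))
        except AttributeError:
            continue
    return None


if __name__ == "__main__":
    strcoll = load_icu_function("ucol_strcollUTF8", 57)
    print("ICU 57 available" if strcoll is not None else "ICU 57 not found")
```

With versioned symbols, two major versions can be loaded into one process without their names clashing. That is why a "--disable-renaming" build makes side-by-side loading harder.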
>> Yeah, it seems like ICU is *also* subject to minor changes that happen
>> under your feet, much like libc. For example maintenance release 60.2
>> (you can't install that at the same time as 60.1, but you can install
>> it at the same time as 59.2). You'd be linked against libicu.so.60
>> (and thence libicudata.so.60), and it gets upgraded in place when you
>> run the local equivalent of apt-get upgrade.
>
> This always worried me because an unexpected collation change is so painful for a database. And I was never able to think of a way of reliably testing compatibility either because of ICU's ability to reorder and group characters when collating.

I think the best we can do is to track versions per dependency (i.e. record it when the CHECK is created, when the index is created or rebuilt, ...) and generate loud warnings until you've dealt with each version dependency. That's why I've suggested we could consider sticking it on pg_depend (though I have apparently failed to convince Stephen so far). I think something like that is better than the current collversion design, which punts the problem to the DBA: "hey, human, there might be some problems, but I don't know where! Please tell me when you've fixed them by running ALTER COLLATION ... REFRESH VERSION!" instead of having the computer keep track of what actually needs to be done on an object-by-object basis and update the versions one-by-one automatically when the problems are resolved.

-- 
Thomas Munro
http://www.enterprisedb.com
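
The per-dependency tracking idea in the message above could be modelled roughly as follows. This is only a toy in-memory sketch, not the pg_depend design itself. The class name VersionTracker, the object names, and the version strings are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class VersionTracker:
    # (dependent object, collation) -> collation version recorded when the
    # object was created or last rebuilt. Hypothetical stand-in for a
    # per-dependency catalog column.
    recorded: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def record(self, obj: str, collation: str, version: str) -> None:
        """Call when an index or constraint is created or rebuilt."""
        self.recorded[(obj, collation)] = version

    def stale(self, current: Dict[str, str]) -> List[str]:
        """Return one warning per object whose recorded version differs
        from the version the collation provider reports now."""
        warnings = []
        for (obj, collation), version in sorted(self.recorded.items()):
            now = current.get(collation)
            if now is not None and now != version:
                warnings.append(
                    f'WARNING: {obj} depends on collation "{collation}" '
                    f"version {version}, but the provider now reports {now}"
                )
        return warnings


if __name__ == "__main__":
    tracker = VersionTracker()
    tracker.record("index idx_name", "en-x-icu", "153.14")
    tracker.record("check constraint chk_code", "en-x-icu", "153.14")

    # The library is upgraded in place; both objects are now suspect.
    current = {"en-x-icu": "153.80"}
    for line in tracker.stale(current):
        print(line)

    # Rebuilding the index records the new version, so only the
    # constraint keeps warning.
    tracker.record("index idx_name", "en-x-icu", "153.80")
    for line in tracker.stale(current):
        print(line)
```

Each warning clears only when the specific object is dealt with, so nobody has to declare the whole collation fixed in one step.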