Re: Collation version tracking for macOS - Mailing list pgsql-hackers
From | Thomas Munro |
---|---|
Subject | Re: Collation version tracking for macOS |
Date | |
Msg-id | CA+hUKG+PNqUn5oG6hFgPcy7AyxuSbpNKo-u=Bobe=dn7k8sVZw@mail.gmail.com |
In response to | Re: Collation version tracking for macOS (Peter Geoghegan <pg@bowt.ie>) |
Responses | Re: Collation version tracking for macOS |
List | pgsql-hackers |
On Wed, Jun 8, 2022 at 10:59 AM Peter Geoghegan <pg@bowt.ie> wrote:
> On Tue, Jun 7, 2022 at 3:27 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > Yeah, it's possible to link against multiple versions in theory and
> > that might be a way to do it if we were shipping our own N copies of
> > ICU like DB2 does, but that's hard in practice for shared libraries on
> > common distros (and vendoring or static linking of such libraries was
> > said to be against many distros' rules, since it would be a nightmare
> > if everyone did that, though I don't have a citation for that).
>
> I'm not saying that it's going to be easy, but I can't see why it
> should be impossible. I use Debian unstable for most of my work. It
> supports multiple versions of LLVM/clang, not just one (though there
> is a virtual package with a default version, I believe). What's the
> difference, really?

The difference is that Debian has libllvm-{11,12,13,14}-dev packages, but it does *not* have multiple -dev packages for libicu, just a single libicu-dev which can be used to compile and link against their chosen current library version. They do have multiple packages for the actual .so and allow them to be installed concurrently. Therefore, you could install N .sos and dlopen() them, but you *can't* write a program that compiles and links against N versions at the same time using their packages (despite IBM's work to make that possible, perhaps for use in their own databases).

> Packaging standards certainly matter, but they're not immutable laws
> of the universe. It seems reasonable to suppose that the people that
> define these standards would be willing to hear us out -- this is
> hardly a trifling matter, or something that only affects a small
> minority of *their* users.

OK, yeah, I'm thinking within the confines of what we can do easily right now, on existing systems as they currently package software, by changing only our own code, not "tell Debian to change their packaging so we can compile and link against N versions". Supposing Debian maintainers (and all the others) agreed, there'd still be something else in favour of dlopen(): wouldn't it be nice if the users were not limited by the versions that the packager of PostgreSQL decided to link against? What if someone has a good reason to want to use ICU versions that are older than Debian currently ships, that are easily available in add-on repos?

> > Yeah, I've flip-flopped a couple of times on the question of whether
> > ICU63 and ICU67 should be different collation providers, or
> > individual collations should somehow specify the library they want to
> > use (admittedly what I showed above with a raw library name is pretty
> > ugly and some indirection scheme might be nice). It would be good to
> > drill into the pros and cons of those two choices.
>
> I think that there are pretty good technical reasons why each ICU
> version is tied to a particular version of CLDR. Implementing CLDR
> correctly and efficiently is a rather difficult process, even if we
> ignore figuring out what natural language rules make sense. And so
> linking to multiple different ICU versions doesn't really seem like
> overkill to me. Or if it is then I can easily think of far better
> examples of software bloat. Defining "stable behavior for collations"
> as "uses exactly the same software artifact over time" is defensive
> (compared to always linking to one ICU version that does it all), but
> we have plenty that we need to defend against here.

I think we're not understanding each other here: I was talking about the technical choice of whether we'd model the multiple library versions in our catalogues as different "collprovider" values, or somehow encode them into the "collcollate" string, or something else. I'm with you, I'm already sold on the multi-library concept (and have been in several previous cycles of this recurring discussion), which is why I'm trying to move on to discussing the nuts and bolts, and the packaging and linking realities that apparently stopped any prototype from appearing last time around.
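
To make the dlopen() idea above a bit more concrete, here is a rough sketch, not a patch. It assumes a Linux-style soname (libicui18n.so.NN) and ICU's default symbol renaming, where each exported function carries the major version as a suffix (e.g. ucol_open_67). The IcuLib struct and the icu_* helper names are made up for illustration and do not exist in PostgreSQL or ICU. Because no ICU headers are included, UErrorCode is declared as a plain int here, which is an assumption about the enum's ABI.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>
#include <string.h>

/* Stand-ins for ICU types so no ICU -dev package is needed (assumption). */
typedef struct UCollator UCollator;
typedef int UErrorCode;			/* ICU declares this as an enum */

typedef UCollator *(*ucol_open_fn) (const char *loc, UErrorCode *status);
typedef void (*ucol_close_fn) (UCollator *coll);

/* Hypothetical per-version handle: one of these per loaded ICU major. */
typedef struct IcuLib
{
	int			major;
	void	   *handle;
	ucol_open_fn open;
	ucol_close_fn close;
} IcuLib;

/* Build e.g. "libicui18n.so.67"; returns 0 if the buffer is too small. */
static int
icu_library_name(char *buf, size_t size, int major)
{
	int			n;

	assert(buf != NULL && size > 0);
	n = snprintf(buf, size, "libicui18n.so.%d", major);
	return n > 0 && (size_t) n < size;
}

/* Build e.g. "ucol_open_67"; returns 0 if the buffer is too small. */
static int
icu_symbol_name(char *buf, size_t size, const char *base, int major)
{
	int			n;

	assert(buf != NULL && size > 0);
	n = snprintf(buf, size, "%s_%d", base, major);
	return n > 0 && (size_t) n < size;
}

/* Try to load one ICU major version; returns 1 on success, 0 otherwise. */
static int
icu_load(IcuLib *lib, int major)
{
	char		name[64];

	memset(lib, 0, sizeof(*lib));
	lib->major = major;

	if (!icu_library_name(name, sizeof(name), major))
		return 0;
	lib->handle = dlopen(name, RTLD_NOW | RTLD_LOCAL);
	if (lib->handle == NULL)
		return 0;				/* that version is not installed */

	/* POSIX idiom for storing a dlsym() result in a function pointer. */
	if (icu_symbol_name(name, sizeof(name), "ucol_open", major))
		*(void **) &lib->open = dlsym(lib->handle, name);
	if (icu_symbol_name(name, sizeof(name), "ucol_close", major))
		*(void **) &lib->close = dlsym(lib->handle, name);

	if (lib->open == NULL || lib->close == NULL)
	{
		dlclose(lib->handle);
		memset(lib, 0, sizeof(*lib));
		return 0;
	}
	return 1;
}
```

A caller could then do something like icu_load(&lib, 67) and, if that succeeds, call lib.open("en", &status), independently of whichever single ICU version the PostgreSQL binary itself was compiled against. How a collation would name the version it wants (a separate "collprovider" value, something encoded in "collcollate", or some indirection) is exactly the open question above, and this sketch does not take a position on it.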