Re: Collation version tracking for macOS - Mailing list pgsql-hackers
From | Thomas Munro |
---|---|
Subject | Re: Collation version tracking for macOS |
Date | |
Msg-id | CA+hUKGLQD8y+S5rUTKm6wS_CZ++TN3bM3cr9UoxkhacBmmj1Dw@mail.gmail.com Whole thread Raw |
In response to | Re: Collation version tracking for macOS (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: Collation version tracking for macOS
|
List | pgsql-hackers |
On Sat, Jun 11, 2022 at 4:21 PM Peter Geoghegan <pg@bowt.ie> wrote: > What about "time travel collations", but without the time travel part? > That is, what about supporting multiple ICU versions per cluster, but > not per database? So you could upgrade the OS and Postgres, using > standard packages that typically just use the latest ICU version -- > typically, but not always. If you happen to have been on an older > version of ICU on upgrade, then that version of ICU will still work at > the level of a whole database -- your database. Maybe you can create > new databases with old and new ICU versions if you want to. > > That obviously runs into the problem of needing to eventually do a > dump and reload -- but I suppose that "eventually" could be a very > long time. At least the OS package doesn't declare one version of ICU > the blessed version, now and forever, effectively vendoring ICU in a > backdoor fashion. At least old databases have significant runway, > while at the same time new databases that want to use the same > standard Postgres package aren't forced to use the same ancient ICU > version. Hmm. I think that's effectively what you'd get using my "distinct collation" patch (v1, or this much better v3, attached), if you put version prefixes in colliculocale, and updated them in the template database after an OS upgrade to affect new databases. I realise you probably mean something a little more automatic... I think "pinned forever" ICU versions would be useful, because I think there are very few expert users who want fine generalised control over version changes, but almost all other users don't care at all about any of this stuff -- as long as their indexes keep indexing and their constraints keep constraining. So I think you can make a lot of people happy by ignoring the complexities of upgrades and providing a way to nail the version down for the lifetime of the database. Also, it's not *impossible* to move to a later ICU, it's just a bit tricky; the key point is that it's under your control if you want to do that, independently of an OS upgrade, as you said. Based on my reading of that DB2 manual page, I reckon my v3 "distinct collation" patch is about as good as what they have. If you don't choose to use prefixes then later OS upgrades (ie upgrades that change the version of ICU that PostgreSQL is linked against) might corrupt your indexes and constraints -- I think that's what they're saying -- though at least we'll try to warn about that with our weak warning system. If you do choose to use prefixes you'll be stuck on that ICU version forever, even across updates that cause PostgreSQL to be linked to future releases of ICU, unless you're prepared to do a whole bunch of careful upgrading work (perhaps with some future tooling to help with that). Attached is a much more fleshed out version of the "distinct collation" patch (to be clear: v3 descends from v1, while v2 was at attempt at the timelord approach). Main changes: 1. I now also route strTo{Upper,Lower,Title} to the right version of ICU. That requires dlopen'ing a second library. 2. You create distinct collations with optional ICU major version prefixes, along the lines of what Jim was suggesting. For example (and I'm not actually proposing -x-icu67 suffixes, just avoiding a collision in my example): postgres=# create collation "en-x-icu67" (provider = icu , locale = '67:en'); CREATE COLLATION postgres=# create collation "en-x-icu63" (provider = icu , locale = '63:en'); CREATE COLLATION 3. For versions other than the one we are linked against, it tries to open libraries with typical names inside $libdir. An administrator could drop symlinks in there like so: $ ln -s /usr/lib/x86_64-linux-gnu/libicui18n.so.63 ~/install/lib/postgresql/ $ ln -s /usr/lib/x86_64-linux-gnu/libicuuc.so.63 ~/install/lib/postgresql/ What I like about this $libdir scheme is that I imagine that we could ask our friends in the packaging teams to create packages for that. Users would then think of them in much the same way as extensions. You'd just type: $ sudo apt-get install postgresql-16-icu71 Happy to keep trying to figure out the competing and rather more ambitious TT version too (that I sketched some bits of in v2), but I'm a whole lot fuzzier on how that can work and kinda stuck on the problems I raised.
Attachment
pgsql-hackers by date: