Re: Collation version tracking for macOS - Mailing list pgsql-hackers
From | Thomas Munro |
---|---|
Subject | Re: Collation version tracking for macOS |
Date | |
Msg-id | CA+hUKG+OSQtrRAk-bHwMJmuvqp-b-LGuxsF2PoDBVyQcT+VEAQ@mail.gmail.com Whole thread Raw |
In response to | Re: Collation version tracking for macOS (Jeff Davis <pgsql@j-davis.com>) |
Responses |
Re: Collation version tracking for macOS
Re: Collation version tracking for macOS |
List | pgsql-hackers |
Replying to Peter and Jeff in one email. On Sat, Nov 12, 2022 at 3:57 AM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > On 22.10.22 03:22, Thomas Munro wrote: > > I'd love to hear others' thoughts on how we can turn this into a > > workable solution. Hopefully while staying simple... > > I played with this patch a bit. It looks like a reasonable approach. Great news. > Attached is a small patch to get the dynamic libicu* lookup working with > the library naming on macOS. Thanks, squashed. > Instead of packing the ICU version into the locale field ('63:en'), I > would make it a separate field in pg_collation and a separate argument > in CREATE COLLATION. I haven't tried this yet, as I focused on coming up with a way of testing in this iteration. I can try this next. I'm imagining that we'd have pg_collation.collicuversion and pg_database.daticuversion, and they'd default to 0 for "use the GUC", and perhaps you'd even be able to ALTER them. Perhaps we wouldn't even need the GUC then... 0 could mean "the linked version", and if you don't like it, you ALTER it. Thinking about this. > At this point, perhaps it would be good to start building some tests to > demonstrate various upgrade scenarios and to ensure portability. OK, here's what I came up with. You enable it in PG_TEST_EXTRA, and tell it about an alternative ICU version you have in the standard library search path that is not the same as the main/linked one: $ meson configure -DPG_TEST_EXTRA="icu=63" $ meson test icu/020_multiversion Another change from your feedback: you mentioned that RHEL7 shipped with ICU 50, so I removed my suggestion of dropping some extra code we carry for versions before 54 and set the minimum acceptable version to 50. It probably works further back than that, but that's a decent range, I think. On Tue, Nov 15, 2022 at 1:55 PM Jeff Davis <pgsql@j-davis.com> wrote: > I looked at v6. Thanks for jumping in and testing! > * We'll need some clearer instructions on how to build/install extra > ICU versions that might not be provided by the distribution packaging. > For instance, I got a cryptic error until I used --enable-rpath, which > might not be obvious to all users. Suggestions welcome. No docs at all yet... > * Can we have a better error when the library was built with -- > disable-renaming? We can just search for the plain (no suffix) symbol. I threw out that symbol probing logic, and wrote something simpler that should now also work with --disable-renaming (though not tested). Now it does a cross-check with the library's self-reported major version, just to make sure there wasn't a badly named library file, which may be more likely with --disable-renaming. > * We should use dlerror() instead of %m to report dlopen() errors. Fixed. > * It seems like the collation version is just there to issue WARNINGs > when a user is using the non-versioned locale syntax and the library > changes underneath them (or if there is collation version change within > a single ICU major version)? Correct. I have now updated the warning messages you get when they don't match, to provide a hint about what to do about it. I am sure they need some more word-smithing, though. > * How are you testing this? Ad hoc noodling before now, but see attached. > I realize your patch is experimental, but when there is a better > consensus on the approach, we should consider adding declarative syntax > such as: > > CREATE COLLATION (or LOCALE?) PROVIDER icu67 > TYPE icu VERSION '67' AS '/path/to/icui18n.so.67'; > > It will offer more opportunities to catch errors early and offer better > error messages. It would also enable it to function if the library is > built with --disable-renaming (though we'd have to trust the user). Earlier in this and other threads, we wondered if each ICU major version should be a separate provider, which is what you're showing there, or should be an independent property of an individual COLLATION, which is what v6 did with '63:en' and what Peter suggested I make more formal with CREATE COLLATION foo (..., ICU_VERSION=63). I actually started out thinking we'd have multiple providers, but I couldn't really think of any advantage, and I think it makes some upgrade scenarios more painful. Can you elaborate on why you'd want that model? > On Sat, 2022-10-22 at 14:22 +1300, Thomas Munro wrote: > > Problem 1: Suppose you're ready to start using (say) v72. I guess > > you'd use the REFRESH command, which would open the main linked ICU's > > collversion and stamp that into the catalogue, at which point new > > sessions would start using that, and then you'd have to rebuild all > > your indexes (with no help from PG to tell you how to find everything > > that needs to be rebuilt, as belaboured in previous reverted work). > > Aside from the possibility of getting the rebuilding job wrong (as > > belaboured elsewhere), it's not great, because there is still a > > transitional period where you can be using the wrong version for your > > data. So this requires some careful planning and understanding from > > the administrator. > > How is this related to the search-by-collversion design? It seems like > it's hard no matter what. Yeah. I just don't like the way it *appears* to be doing something clever, but it doesn't solve any fundamental problem at all because the collversion information is under human control and so it's really doing something stupid. Hence desire to build something that at least admits that it's primitive and just gives you some controls, in a first version. We could always reconsider that in later work though, maybe even an optional policy or something?
Attachment
pgsql-hackers by date: