Re: Collation version tracking for macOS - Mailing list pgsql-hackers
From | Peter Eisentraut |
---|---|
Subject | Re: Collation version tracking for macOS |
Date | |
Msg-id | 0e1bcd64-b32b-a9ef-4d65-fe420d10e5b3@enterprisedb.com |
In response to | Re: Collation version tracking for macOS (Thomas Munro <thomas.munro@gmail.com>) |
List | pgsql-hackers |
On 05.12.22 22:33, Thomas Munro wrote:
> On Tue, Dec 6, 2022 at 6:45 AM Joe Conway <mail@joeconway.com> wrote:
>> On 12/5/22 12:41, Jeff Davis wrote:
>>> On Mon, 2022-12-05 at 16:12 +1300, Thomas Munro wrote:
>>>> 1. I think we should seriously consider provider = ICU63. I still
>>>> think search-by-collversion is a little too magical, even though it
>>>> clearly can be made to work. Of the non-magical systems, I think
>>>> encoding the choice of library into the provider name would avoid the
>>>> need to add a second confusing "X_version" concept alongside our
>>>> existing "X_version" columns in catalogues and DDL syntax, while still
>>>> making it super clear what is going on.
>>>
>>> As I understand it, this is #2 in your previous list?
>>>
>>> Can we put the naming of the provider into the hands of the user, e.g.:
>>>
>>> CREATE COLLATION PROVIDER icu63 TYPE icu
>>> AS '/path/to/libicui18n.so.63', '/path/to/libicuuc.so.63';
>>>
>>> In this model, icu would be a "provider kind" and icu63 would be the
>>> specific provider, which is named by the user.
>>>
>>> That seems like the least magical approach, to me. We need an ICU
>>> library; the administrator gives us one that looks like ICU; and we're
>>> happy.
>>
>> +1
>>
>> I like this. The provider kind defines which path we take in our code,
>> and the specific library unambiguously defines a specific collation
>> behavior (I think, ignoring bugs?)
>
> OK, I'm going to see what happens if I try to wrangle that stuff into
> a new catalogue table.

I'm reviewing the commit fest entry
https://commitfest.postgresql.org/41/3956/, which points to this thread.
It appears that the above patch did not come about in time. The patch of
record is now Jeff's refactoring patch, which is also tracked in another
commit fest entry (https://commitfest.postgresql.org/41/4058/). So as a
matter of procedure, we should probably close this commit fest entry for
now.
(Maybe we should also use a different thread subject in the future.)

I have a few quick comments on the above syntax example:

There is currently a bunch of locale-using code that selects different
code paths by "collation provider", i.e., a libc-based code path and an
ICU-based code path (and sometimes also a default provider path). The
above proposal would shift the terminology and would probably require
some churn at those sites, in that they would now have to select by
"collation provider type". We could probably avoid that by shifting the
terms a bit, so instead of the suggested

provider type -> provider

we could use

provider -> version of that provider

(or some other actual term), which would leave the meaning of "provider"
unchanged as far as locale-using code is concerned. At least that's my
expectation, since no code for this has been seen yet. We should keep
this in mind in any case.

Also, the above example exposes a lot of operating system level details.
This creates issues with dump/restore, which some of the earlier patches
avoided by using a path-based approach, and it would also require some
thoughts about permissions. We probably want non-superusers to be able
to interact with this system somehow, for upgrading (for some meaning of
that action) indexes etc. without superuser access. The more stuff from
the OS we expose, the more stuff we have to be able to lock down again
in a usable manner. (The search-by-collversion approach can probably
avoid those issues better.)
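
To make the terminology point above concrete, here is a minimal sketch of the "provider -> version of that provider" arrangement. It is only an illustration under assumptions: the names Provider, Locale, provider_version, compare_path and library_key are hypothetical and are not PostgreSQL's actual definitions, and no patch in this thread is being quoted. The idea shown is that call sites which branch on the provider stay as they are, while only the code that locates a library consults the additional version field.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL's real types. */
typedef enum
{
	PROVIDER_LIBC,
	PROVIDER_ICU
} Provider;

typedef struct
{
	Provider	provider;			/* what locale-using code branches on */
	const char *provider_version;	/* e.g. "63"; NULL means default library */
} Locale;

/*
 * A typical locale-using call site: it selects a code path by provider
 * only, so adding provider_version does not require touching it.
 */
static const char *
compare_path(const Locale *loc)
{
	switch (loc->provider)
	{
		case PROVIDER_ICU:
			return "icu";
		case PROVIDER_LIBC:
			return "libc";
	}
	return "unknown";
}

/*
 * Only the library lookup looks at the version.  It builds a key such as
 * "icu63" that some (assumed) registry could map to a loaded library.
 */
static const char *
library_key(const Locale *loc, char *buf, size_t buflen)
{
	const char *base = compare_path(loc);

	if (loc->provider_version == NULL)
		snprintf(buf, buflen, "%s", base);
	else
		snprintf(buf, buflen, "%s%s", base, loc->provider_version);
	return buf;
}
```

With this split, two locales that differ only in provider_version take the same branch in compare_path, which is the sense in which the meaning of "provider" would stay unchanged for locale-using code.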