Re: Collation version tracking for macOS - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Collation version tracking for macOS
Date
Msg-id CAH2-WznSi6muvAZsocdGCGnACL+NMziyXEkjDfz0n30BYUwS9w@mail.gmail.com
In response to Re: Collation version tracking for macOS  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Thu, Jun 9, 2022 at 9:31 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> Perhaps that could be modeled with a pg_depend row pointing to a
> pg_icu_library row, which you'd probably need anyway, to prevent a
> registered ICU library that is needed for a live index from being
> dropped.  (That's assuming that the pg_icu_library catalogue concept
> has legs...  well if we're going with dlopen(), we'll need *somewhere*
> to store the shared object paths.  Perhaps it's not a given that we
> really want paths in a table... I guess it might prevent certain
> cross-OS streaming rep scenarios, but mostly that'd be solvable with
> symlinks...)

Do we even need to store a version for indexes most of the time if
we're versioning ICU itself, as part of the "time travelling
collations" design? For that matter, do we even need to version
collations directly anymore?

I'm pretty sure that the value of pg_collation.collversion is nearly
always the same in practice, or at least highly redundant, because
mostly it's just an ICU version. This is what I see on my system, at
least:

pg@regression:5432 [53302]=# select count(*), collversion from
pg_collation where collprovider = 'icu' group by 2;
 count │ collversion
───────┼─────────────
   329 │ 153.112.41
   471 │ 153.112
(2 rows)

(Offhand, I'm not sure why there are two distinct collversion values,
but in general it looks like collversion isn't terribly meaningful at
the level of individual pg_collation entries.)

If indexes and constraints with old physical collations are defined as
the exception to the general rule (the rule being "every index uses
the current ICU version for the database as a whole"), and if those
indexes/constraints are enumerated and stored (in a new system
catalog) when a switchover of the database's ICU version is first
initiated, then there might not be any meaningful dependency to speak
of. Not for indexes, at least.

The *database as a whole* is dependent on the current version of ICU
-- it's not any one index. Very occasionally the database will also be
dependent on a single older ICU version that we're still transitioning
away from. There is a "switch-a-roo" going on, but not really at the
level of indexes -- it's a very specialized thing, that works at the
level of the whole database, and involves exactly 2 ICU versions. You
should probably be able to back out of it once it begins, but mostly
it's an inflexible process that just does what we need it to do.
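
To make the shape of that idea concrete, here is a toy Python model of
it. This is only a sketch, not PostgreSQL code: every name in it
(Database, pinned_to_old, begin_switchover, reindex) is hypothetical,
the version strings are placeholders, and it assumes that reindexing
is what moves an index off the old ICU version. Backing out of a
switchover is not modelled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Database:
    """Toy model: one database-wide ICU version plus a set of exceptions."""
    current_icu: str
    old_icu: Optional[str] = None                     # set only mid-switchover
    indexes: set = field(default_factory=set)         # collation-dependent indexes
    pinned_to_old: set = field(default_factory=set)   # the "exception" catalog

    def _finish_if_done(self) -> None:
        # Once nothing is pinned, the old version is no longer needed.
        if not self.pinned_to_old:
            self.old_icu = None

    def begin_switchover(self, new_icu: str) -> None:
        if self.old_icu is not None:
            raise RuntimeError("a switchover is already in progress")
        # Enumerate existing indexes once, when the switchover starts.
        self.pinned_to_old = set(self.indexes)
        self.old_icu, self.current_icu = self.current_icu, new_icu
        self._finish_if_done()

    def icu_version_for(self, index: str) -> str:
        # General rule: the database-wide version. Exception: pinned indexes.
        if index in self.pinned_to_old:
            return self.old_icu
        return self.current_icu

    def reindex(self, index: str) -> None:
        # Assumption: a rebuilt index uses the current version from then on.
        self.pinned_to_old.discard(index)
        self._finish_if_done()


db = Database(current_icu="v1", indexes={"a_idx", "b_idx"})
db.begin_switchover("v2")
print(db.icu_version_for("a_idx"))   # still on the old version
db.reindex("a_idx")
print(db.icu_version_for("a_idx"))   # now on the database-wide version
```

Note that nothing here stores a version per index: an index is either
listed as an exception or it isn't, and at most two versions are ever
in play.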

Does something like that seem sensible to you?

--
Peter Geoghegan
