Re: Collation version tracking for macOS - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Collation version tracking for macOS
Msg-id 606bd2baa6d65b38fee6eb23bba40c5da210255b.camel@j-davis.com
In response to Re: Collation version tracking for macOS  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Collation version tracking for macOS
List pgsql-hackers
I looked at v6.

  * We'll need some clearer instructions on how to build/install extra
ICU versions that might not be provided by the distribution packaging.
For instance, I got a cryptic error until I used --enable-rpath, which
might not be obvious to all users.
  * Can we have a better error when the library was built with
--disable-renaming? We can just search for the plain (no suffix) symbol.
  * We should use dlerror() instead of %m to report dlopen() errors.
  * It seems like the collation version is just there to issue WARNINGs
when a user is using the non-versioned locale syntax and the library
changes underneath them (or if there is a collation version change
within a single ICU major version)?
  * How are you testing this?
  * In my tests (sort, hacked so abbreviate is always false), I see a
~3% regression for ICU+UTF8. That's fine with me. I assume it's due to
the indirect function call, but that's not obvious to me from the
profile. If it's a major problem we could have a special case of
varstrfastcmp_locale() that works on the compile-time ICU version.
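
To make the dlerror() and --disable-renaming points concrete, here is a
minimal sketch of the loader side. It is not code from the patch: the
function names open_library() and lookup_symbol() and the message
wording are made up for illustration. It assumes only the POSIX
dlopen()/dlsym()/dlerror() calls and ICU's convention of suffixing
renamed symbols with "_<major>".

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: open a shared library and, on failure, report the
 * dlerror() text. dlopen() is not guaranteed to set errno, so formatting
 * the failure with %m can print an unrelated message.
 */
static void *
open_library(const char *path, char *errbuf, size_t errlen)
{
    void       *handle;

    assert(errbuf != NULL && errlen > 0);
    errbuf[0] = '\0';

    handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL)
    {
        const char *msg = dlerror();

        snprintf(errbuf, errlen, "could not load library \"%s\": %s",
                 path, msg ? msg : "unknown error");
    }
    return handle;
}

/*
 * Hypothetical helper: try the renamed symbol first (e.g. "ucol_open_67"),
 * then fall back to the plain name that a --disable-renaming build exports.
 */
static void *
lookup_symbol(void *handle, const char *name, int major,
              char *errbuf, size_t errlen)
{
    char        versioned[128];
    void       *sym;

    assert(errbuf != NULL && errlen > 0);
    errbuf[0] = '\0';

    snprintf(versioned, sizeof(versioned), "%s_%d", name, major);
    sym = dlsym(handle, versioned);
    if (sym == NULL)
        sym = dlsym(handle, name);  /* plain, unsuffixed symbol */
    if (sym == NULL)
    {
        const char *msg = dlerror();

        snprintf(errbuf, errlen, "could not find symbol \"%s\" or \"%s\": %s",
                 versioned, name, msg ? msg : "unknown error");
    }
    return sym;
}
```

Whether silently accepting the plain symbol is the right policy is a
separate question, since nothing then ties the library to the requested
major version; the sketch only shows that the lookup itself is cheap.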

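The special case mentioned in the last item could look roughly like the
sketch below: take a direct call when the requested version matches the
one we were compiled against, and only go through the function pointer
otherwise. Everything here is a stand-in for illustration (CollatorLib,
LINKED_MAJOR_VERSION, and the strcmp()-based comparators are
hypothetical); it does not call ICU or reproduce varstrfastcmp_locale().

```c
#include <string.h>

/* Hypothetical: the ICU major version this binary was compiled against. */
#define LINKED_MAJOR_VERSION 67

typedef int (*compare_fn) (const char *a, const char *b);

/* Hypothetical per-library entry; "compare" would be filled in via dlsym(). */
typedef struct CollatorLib
{
    int         major_version;
    compare_fn  compare;
} CollatorLib;

/* Stand-in for the comparator of the library linked at compile time. */
static int
linked_compare(const char *a, const char *b)
{
    return strcmp(a, b);
}

/*
 * Stand-in for a comparator resolved at run time from another library
 * version. It reverses the order only to make the dispatch visible.
 */
static int
dlopened_compare(const char *a, const char *b)
{
    return strcmp(b, a);
}

static int
compare_with(const CollatorLib *lib, const char *a, const char *b)
{
    /* Fast path: direct call, which the compiler is free to inline. */
    if (lib->major_version == LINKED_MAJOR_VERSION)
        return linked_compare(a, b);

    /* Slow path: indirect call through the pointer for this library. */
    return lib->compare(a, b);
}
```

The extra branch is not free either, so this would only be worth doing if
profiling confirms the indirect call is what costs the ~3%.
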
I realize your patch is experimental, but when there is a better
consensus on the approach, we should consider adding declarative syntax
such as:

   CREATE COLLATION (or LOCALE?) PROVIDER icu67
     TYPE icu VERSION '67' AS '/path/to/icui18n.so.67';

It will offer more opportunities to catch errors early and offer better
error messages. It would also enable it to function if the library is
built with --disable-renaming (though we'd have to trust the user).

On Sat, 2022-10-22 at 14:22 +1300, Thomas Munro wrote:
> Problem 1:  Suppose you're ready to start using (say) v72.  I guess
> you'd use the REFRESH command, which would open the main linked ICU's
> collversion and stamp that into the catalogue, at which point new
> sessions would start using that, and then you'd have to rebuild all
> your indexes (with no help from PG to tell you how to find everything
> that needs to be rebuilt, as belaboured in previous reverted work).
> Aside from the possibility of getting the rebuilding job wrong (as
> belaboured elsewhere), it's not great, because there is still a
> transitional period where you can be using the wrong version for your
> data.  So this requires some careful planning and understanding from
> the administrator.

How is this related to the search-by-collversion design? It seems like
it's hard no matter what.


--
Jeff Davis
PostgreSQL Contributor Team - AWS




