Re: Collation versioning - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Collation versioning
Msg-id CAH2-WzkH_XwzHjG_zAnOtmRVnJeB8T6PUKVEJvy8yk=pFWGoZg@mail.gmail.com
In response to Re: Collation versioning  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
On Mon, Sep 24, 2018 at 1:47 PM Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> Personally I'm not planning to work on multi-version installation any
> time soon, I was just scoping out some basic facts about all this.  I
> think the primary problem that affects most of our users is the
> shifting-under-your-feet problem, which we now see applies equally to
> libc and libicu.

Are we sure about that? Could it just be that ICU will fix bugs that
cause their strcoll()-alike and strxfrm()-alike functions to give
behavior that isn't consistent with the behavior required by the CLDR
version in use?

This seems like it might be a very useful distinction. We know that
glibc had bugs that were caused by strxfrm() not agreeing with
strcoll() -- that was behind the 9.5-era abbreviated keys issues. But
that was actually a bug in an optimization in strcoll(), rather than a
strxfrm() bug. strxfrm() gave the correct answer, which is to say the
answer that was right according to the high-level collation
definition. It merely failed to be bug-compatible with strcoll().
What's ICU supposed to do about an issue like that?
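
To make the strcoll()/strxfrm() distinction concrete, here is a rough
sketch of the kind of cross-check that would expose such a mismatch. It
uses Python's locale module, which wraps the C library's collation
functions, purely for illustration. The helper names (sign,
find_disagreements) and the sample strings are made up for this example,
and it tests whatever LC_COLLATE setting is currently in effect rather
than any particular locale.

```python
import locale
from itertools import combinations


def sign(n):
    """Collapse a comparison result to -1, 0, or 1."""
    return (n > 0) - (n < 0)


def find_disagreements(strings):
    """Return the pairs that strcoll() and strxfrm() order differently.

    For each pair, compare the sign of strcoll(a, b) with the sign
    obtained by comparing the strxfrm() keys of a and b directly.
    """
    bad = []
    for a, b in combinations(strings, 2):
        by_strcoll = sign(locale.strcoll(a, b))
        key_a, key_b = locale.strxfrm(a), locale.strxfrm(b)
        by_strxfrm = (key_a > key_b) - (key_a < key_b)
        if by_strcoll != by_strxfrm:
            bad.append((a, b))
    return bad


# Hypothetical sample data; real testing would need far more input.
sample = ["apple", "Apple", "banana", "cherry", "apple pie"]
print(find_disagreements(sample))
```

An empty list means the two functions agreed on every pair tried. A
non-empty list only says that they disagree, not which of the two is
wrong with respect to the collation definition, which is exactly the
ambiguity described above.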

If we're going to continue to rely on the strxfrm() equivalent from
ICU, then it seems to me that ICU should be able to change behaviors
in a stable release, provided the behavior they're changing is down to
a bug in their infrastructure, as opposed to an organic evolution in
how some locale sorts text (CLDR update). My understanding is that ICU
is designed to decouple technical issues from issues of concern to
natural language experts, so we as an ICU client can limit ourselves
to worrying about one of the two at any given time.

-- 
Peter Geoghegan

